Abstract

In a data exchange setting with target constraints, it is often the
case that a given source instance has no solutions. Intuitively, this
happens when data sources contain inconsistent or conflicting
information that is exposed by the target constraints at hand. In such
cases, the semantics of target queries trivialize, because the certain
answers of every target query over the given source instance evaluate
to “true”. The aim of this paper is to introduce and explore a new
framework that gives meaningful semantics in such cases by using the
notion of exchange-repairs. Informally, an exchange-repair of a source
instance is another source instance that differs minimally from the
first, but has a solution. In turn, exchange-repairs give rise to a
natural notion of exchange-repair certain answers (in short,
XR-certain answers) for target queries in the context of data exchange
with target constraints.

After exploring the structural properties of exchange-repairs, we
focus on the problem of computing the XR-certain answers of
conjunctive queries. We show that for schema mappings specified by
source-to-target GAV dependencies and target equality-generating
dependencies (egds), the XR-certain answers of a target conjunctive
query can be rewritten as the consistent answers (in the sense of
standard database repairs) of a union of conjunctive queries over the
source schema with respect to a set of egds over the source schema,
thus making it possible to use a consistent query-answering system to
compute XR-certain answers in data exchange. In contrast, we show that
this type of rewriting is not possible for schema mappings specified
by source-to-target LAV dependencies and target egds, nor for schema
mappings specified by both source-to-target and target GAV
dependencies. We then examine the general case of schema mappings
specified by source-to-target GLAV constraints, a weakly acyclic set
of target tgds and a set of target egds. The main result asserts that,
for such settings, the XR-certain answers of conjunctive queries can
be rewritten as the certain answers of a union of conjunctive queries
with respect to the stable models of a disjunctive logic program over
a suitable expansion of the source schema.

1 Introduction and Summary of Contributions

Data exchange is the problem of transforming data structured under
one schema, called the source schema, into data structured under a
different schema, called the target schema, in such a way that
pre-specified constraints on these two schemas are satisfied. Data
exchange is a ubiquitous data inter-operability task that has been
explored in depth during the past decade (see 0 ). This task is
formalized with the aid of schema mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, where
$\mathbf{S}$ is the source schema, $\mathbf{T}$ is the target schema,
$\Sigma_{st}$ is a set of constraints between $\mathbf{S}$ and
$\mathbf{T}$, and $\Sigma_t$ is a set of constraints on $\mathbf{T}$.
The most thoroughly investigated schema mappings are the ones in which
$\Sigma_{st}$ is a set of source-to-target tuple-generating
dependencies (s-t tgds) and $\Sigma_t$ is a set of target
tuple-generating dependencies (target tgds) and target
equality-generating dependencies (target egds) [H]. An example of such
a schema mapping, along with a target query, appears in Figure 1.

Every schema mapping
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ gives
rise to two distinct algorithmic problems. The first is the existence
and construction of solutions: given a source instance $I$, determine
whether a solution for $I$ exists (i.e., a target instance $J$ so that
$(I, J)$ satisfies $\Sigma_{st} \cup \Sigma_t$) and, if it does,
construct such a “good” solution. The second is to compute the certain
answers of target queries, where if $q$ is a target query and $I$ is a
source instance, then $\mathrm{certain}(q, I, \mathcal{M})$ is the
intersection of the sets $q(J)$, as $J$ varies over all solutions for
$I$. For arbitrary schema mappings specified by s-t tgds and target
tgds and egds, both these problems can be undecidable [55]. However,
as shown in [19], if the set $\Sigma_t$ of target tgds obeys a mild
structural condition, called weak acyclicity, then both these problems
can be solved in polynomial time using the chase procedure. Given a
source instance $I$, the chase procedure attempts to build a “most
general” solution $J$ for $I$ by generating facts that satisfy each
s-t tgd and each target tgd as needed, and by equating two nulls or
equating a null to a constant, as dictated by the egds. If the chase
procedure encounters an egd that equates two distinct constants, then
it terminates and reports that no solution for $I$ exists. Otherwise,
it constructs a universal solution $J$ for $I$, which can also be used
to compute the certain answers of conjunctive queries in time bounded
by a polynomial in the size of $I$.

Consider the situation in which the chase terminates and reports that
no solution exists. In such cases, for every boolean target query $q$,
the certain answers $\mathrm{certain}(q, I, \mathcal{M})$ evaluate to
“true”. Even though the certain answers have become the standard
semantics of queries in the data exchange context, there is clearly
something unsatisfactory about this state of affairs, since the
certain answers trivialize when no solutions exist. Intuitively, the
root cause for the lack of solutions is that the source instance
contains inconsistent or conflicting information that is exposed by
the target constraints of the schema mapping at hand. In turn, this
suggests that alternative semantics for target queries could be
obtained by adopting the notions of database repairs and consistent
answers from the study of inconsistent databases (see [7] for an
overview). We note that several different types of repairs have been
studied in the context of inconsistent databases; the most widely used
ones are the symmetric-difference repairs ($\oplus$-repairs), which
contain as special cases the subset-repairs and the superset-repairs.

How can the notions of database repairs and consistent answers be
adapted to the data exchange framework? On reflection, several
different approaches are possible.

One approach, which we call materialize-then-repair, is as follows:
given a source instance, a target instance is produced by chasing with
the source-to-target tgds in $\Sigma_{st}$ and the target tgds in
$\Sigma_t$, while ignoring the target egds in $\Sigma_t$. Since the
target instance produced this way may very well violate the egds in
$\Sigma_t$, it is treated as an inconsistent instance w.r.t.
$\Sigma_t$, and its repairs are considered. Note that a similar
approach has been adopted by mm in the context of data integration. A
different approach, which we call exchange-as-repair, treats the given
source instance as an inconsistent instance over the combined schema
$\mathbf{S} \cup \mathbf{T}$ w.r.t. the union
$\Sigma_{st} \cup \Sigma_t$ and considers its repairs. Note that this
is in the spirit of [53], where instances in peer data exchange that
do not satisfy the schema mapping at hand are treated as inconsistent
databases over a combined schema. We now point out that neither of
these approaches gives rise to satisfactory semantics.

Figure 2 gives an example of a target instance $J$ that is produced in
the materialize-then-repair approach by chasing with the s-t tgds in
Figure 1. Clearly, $J$ is inconsistent because it violates the egd in
$\Sigma_t$. Consider now the subset repair $J'$ in Figure 3 of our
materialized target instance $J$ (note that, in this case, symmetric
difference repairs coincide with subset repairs). Notice that the
repair $J'$ places peter in the exec department, yet still has him
performing tasks for the software department: the fact that the
“tpsreport” and “spaceout” tasks are derived from a tuple placing
peter in the software department has been lost. The only other repair
of $J$ similarly fails to reflect the shared origin of tuples in the
Tasks and Departments tables, and this disconnect in the
materialize-then-repair approach manifests in the consistent answers
to target queries. In this example, the consistent answers for
boss(peter, $b$) are
{(peter, bobs), (peter, portman), (peter, lumbergh)}.
However, the last two tuples are derived from facts placing peter in
the software department, even though in $J'$ he is not.

$\Sigma_{st} = \{\ \mathrm{Task\_Assignments}(p, t, d) \to \mathrm{Departments}(p, d) \wedge \mathrm{Tasks}(p, t),$
$\qquad\quad\ \mathrm{Stakeholders\_old}(t, s) \to \mathrm{Stakeholders\_new}(t, s)\ \}$

$\Sigma_t = \{\ \mathrm{Departments}(p, d) \wedge \mathrm{Departments}(p, d') \to d = d'\ \}$

$\mathrm{boss}(\mathit{person}, \mathit{stakeholder}) = \exists\, \mathit{task}.\ \mathrm{Tasks}(\mathit{person}, \mathit{task}) \wedge \mathrm{Stakeholders\_new}(\mathit{task}, \mathit{stakeholder})$

Figure 1: A schema mapping $\mathcal{M}$ specified by tgds and egds,
and a target query. In this example, the egd is actually a key
constraint and there are no target tgds.

Figure 2: A source instance $I$ and the inconsistent target instance
$J$ that results from chasing $I$ with the tgds in $\mathcal{M}$.

Figure 3: A symmetric-difference-repair of $J$ w.r.t. $\Sigma_t$.

The situation is no better in the exchange-as-repair approach.
Figure 4 depicts three repairs of this type (using symmetric
difference semantics). While the first two repairs in Figure 4 seem
reasonable, in the third we have eliminated
Task_Assignments(peter, spaceout, software), even though our key
constraint is already satisfied by the removal of
Task_Assignments(peter, meetbobs, exec) alone. In this approach, the
consistent answers of boss(peter, $b$) are $\emptyset$, despite the
intuitive conclusion that peter should be performing tasks for the
bobs regardless of which way we fix the department key constraint
violation. For symmetric-difference repairs, it is equally valid to
satisfy a violated tgd by removing tuples as by adding them.^1
However, in a data exchange setting, the target instance is initially
empty, so it would be more natural to satisfy violated tgds by
deriving new tuples. This observation motivates the particulars of our
approach, which we introduce next.

^1 A noteworthy alternative to symmetric difference repairs is the
loosely-sound semantics of m, discussed in detail in Section 3.3.

Figure 4: Three repairs of $(I, \emptyset)$ (from Figure 2) w.r.t.
$\Sigma_{st} \cup \Sigma_t$. The Stakeholders tables are omitted for
brevity.

1.1 Summary of Contributions

Our aim in this paper is to introduce and explore a new framework that
gives meaningful and non-trivial semantics to queries in data
exchange, including cases in which no solutions exist for a given
source instance.

At the conceptual level, the main contribution is the introduction of
the notion of an exchange-repair. Informally, an exchange-repair of a
source instance is another source instance that differs minimally from
the first, but has a solution. Exchange-repairs give rise to a natural
notion of exchange-repair certain answers (in short, XR-certain
answers) for target queries in the context of data exchange. Note that
if a source instance $I$ has a solution, then the XR-certain answers
of target queries on $I$ coincide with the certain answers of the
queries on $I$. If $I$ has no solutions, then unlike the certain
answers, the XR-certain answers are non-trivial and meaningful.

We provide examples demonstrating that these new semantics improve
upon both the materialize-then-repair approach and the
exchange-as-repair approach discussed earlier. We also produce a
detailed comparison of the XR-certain semantics with the main notions
of inconsistency-tolerant semantics studied in data integration and in
ontology-based data access. This comparison is carried out in
Section 3.2, after we have introduced our framework and presented some
basic structural properties of exchange-repairs in Section 3.

After this, we focus on the problem of computing the XR-certain
answers of conjunctive queries. In Section 4, we show that for schema
mappings specified by source-to-target GAV (global-as-view)
dependencies and target egds, the XR-certain answers of conjunctive
queries can be rewritten as the consistent answers (in the sense of
standard database repairs) of a union of conjunctive queries over the
source schema with respect to a set of egds over the source schema,
thus making it possible to use a consistent query-answering system to
compute XR-certain answers in data exchange. In contrast, we show that
this type of rewriting is not possible for schema mappings specified
by source-to-target LAV (local-as-view) dependencies and target egds,
nor for schema mappings specified by source-to-target and target GAV
dependencies and target egds.

In Section 5, we examine the general case of schema mappings specified
by s-t tgds, a weakly acyclic set of target tgds and a set of target
egds. The main result is that, for such settings, the XR-certain
answers of conjunctive queries can be rewritten as the certain answers
of a union of conjunctive queries with respect to the stable models of
a disjunctive logic program over a suitable expansion of the source
schema. This is achieved in two steps. First, for schema mappings
consisting of GAV s-t tgds, GAV target tgds, and target egds, we show
that the XR-certain answers of conjunctive queries can be reduced to
cautious reasoning over stable models of a disjunctive logic program.
Second, for schema mappings consisting of GLAV s-t tgds, weakly
acyclic sets of GLAV target tgds, and target egds, we show that the
XR-certain answers of conjunctive queries can be rewritten as the
XR-certain answers of conjunctive queries w.r.t. a schema mapping
consisting of GAV s-t tgds, GAV target tgds, and target egds. In fact,
we prove the stronger result that such a rewriting is possible for
schema mappings specified by a second-order s-t tgd, a weakly acyclic
second-order target tgd, and a set of target egds.


2 Preliminaries 

This section contains definitions of basic notions and a minimum
amount of background material. Detailed information about schema
mappings and certain answers can be found in mn], and about repairs
and consistent answers in HH].


2.1 Instances and Homomorphisms 

Fix an infinite set Const of elements, and an infinite set Nulls of
elements such that Const and Nulls are disjoint. A schema $\mathbf{R}$
is a finite set of relation symbols, each having a designated arity.
An $\mathbf{R}$-instance is a finite database $I$ over the schema
$\mathbf{R}$ whose active domain is a subset of
Const $\cup$ Nulls. A fact of an $\mathbf{R}$-instance $I$ is an
expression of the form $R(a_1, \ldots, a_k)$, where $R$ is a relation
symbol of arity $k$ in $\mathbf{R}$ and $(a_1, \ldots, a_k)$ is a
member of the relation on $I$ that interprets the relation symbol $R$.
Every $\mathbf{R}$-instance can be identified with the set of its
facts. We say that an $\mathbf{R}$-instance $I'$ is a sub-instance of
an $\mathbf{R}$-instance $I$ if $I' \subseteq I$, where $I'$ and $I$
are viewed as sets of facts.

By a homomorphism between two instances $K$ and $K'$, we mean a map
$h$ from the active domain of $K$ to the active domain of $K'$ that is
the identity function on all elements of Const and such that for every
atom $R(v_1, \ldots, v_n) \in K$ we have that
$R(h(v_1), \ldots, h(v_n)) \in K'$.


2.2 Schema Mappings and Certain Answers.

A tuple-generating dependency (tgd) is an expression of the form
$\forall \mathbf{x} (\phi(\mathbf{x}) \to \exists \mathbf{y}\, \psi(\mathbf{x}, \mathbf{y}))$,
where $\phi(\mathbf{x})$ and $\psi(\mathbf{x}, \mathbf{y})$ are
conjunctions of atoms over some relational schema.

Tgds are also known as GLAV (global-and-local-as-view) constraints.
Tgds with no existentially quantified variables are called full. Two
important special cases are the GAV constraints and the LAV
constraints: the former are the tgds of the form
$\forall \mathbf{x} (\phi(\mathbf{x}) \to P(\mathbf{x}))$ and the
latter are the tgds of the form
$\forall \mathbf{x} (R(\mathbf{x}) \to \exists \mathbf{y}\, \psi(\mathbf{x}, \mathbf{y}))$,
where $P$ and $R$ are individual relation symbols. Every full tgd is
logically equivalent to a set of GAV tgds that can be computed in
linear time.
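
The splitting of a full tgd into GAV tgds can be sketched as follows.
This is a minimal illustration rather than code from the paper: the
encoding of atoms as (relation, variables) pairs and the function name
`full_tgd_to_gav` are assumptions made here.

```python
# Hypothetical encoding (not from the paper): an atom is a pair
# (relation_name, tuple_of_variables); a full tgd is (body_atoms, head_atoms).

def full_tgd_to_gav(body, head):
    """Split the full tgd  body -> A1 ∧ ... ∧ Ak  into the k GAV tgds  body -> Ai.

    Every output tgd shares the same body and has exactly one head atom,
    so the result is produced in a single pass over the head.
    """
    return [(body, [atom]) for atom in head]


# The first s-t tgd of Figure 1, which is full (no existential variables).
body = [("Task_Assignments", ("p", "t", "d"))]
head = [("Departments", ("p", "d")), ("Tasks", ("p", "t"))]

gav_tgds = full_tgd_to_gav(body, head)
for b, h in gav_tgds:
    print(b, "->", h)
```

The two resulting tgds have heads Departments(p, d) and Tasks(p, t),
and together they are satisfied by exactly the same instances as the
original tgd.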

Suppose we have two disjoint relational schemas $\mathbf{S}$ and
$\mathbf{T}$, called the source schema and the target schema. A
source-to-target tgd (s-t tgd) is a tgd as above such that
$\phi(\mathbf{x})$ is a conjunction over $\mathbf{S}$ and
$\psi(\mathbf{x}, \mathbf{y})$ is a conjunction over $\mathbf{T}$.
When the schemas are understood from context, we may say just tgd even
if the constraint is source-to-target.

An equality-generating dependency (egd) is an expression of the form
$\forall \mathbf{x} (\phi(\mathbf{x}) \to x_i = x_j)$ with
$\phi(\mathbf{x})$ a conjunction of atoms over a relational schema.

For the sake of readability, we will frequently drop universal
quantifiers when writing tgds and egds.

A schema mapping is a quadruple
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, where
$\mathbf{S}$ is a source schema, $\mathbf{T}$ is a target schema,
$\Sigma_{st}$ is a finite set of source-to-target constraints, and
$\Sigma_t$ is a finite set of constraints over the target schema.

We will use the notation GLAV, GAV, LAV, EGD to denote the classes of
sets of constraints consisting of finite sets of, respectively, GLAV
constraints, GAV constraints, LAV constraints, and egds. If $C$ is a
class of sets of source-to-target dependencies and $D$ is a class of
sets of target dependencies, then the notation $C$+$D$ denotes the
class of all schema mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ such
that $\Sigma_{st}$ is a member of $C$ and $\Sigma_t$ is a member of
$D$. For example, GLAV+EGD denotes the class of all schema mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ such
that $\Sigma_{st}$ is a finite set of s-t tgds and $\Sigma_t$ is a
finite set of egds. Moreover, we will use the notation $(D_1, D_2)$ to
denote the union of two classes $D_1$ and $D_2$ of sets of target
dependencies. For example, GAV+(GAV, EGD) denotes the class of all
schema mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ such
that $\Sigma_{st}$ is a set of GAV s-t tgds and $\Sigma_t$ is the
union of a finite set of GAV target tgds with a finite set of target
egds.

Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be
a schema mapping. A target instance $J$ is a solution for a source
instance $I$ w.r.t. $\mathcal{M}$ if $J$ is finite, and the pair
$(I, J)$ satisfies $\mathcal{M}$, i.e., $I$ and $J$ together satisfy
$\Sigma_{st}$, and $J$ satisfies $\Sigma_t$. Recall that, by
definition, instances are finite. Additionally, by convention, we will
assume that source instances do not contain null values. A universal
solution for $I$ is a solution $J$ for $I$ such that if $J'$ is a
solution for $I$, then there is a homomorphism $h$ from $J$ to $J'$
that is the identity on the active domain of $I$. If
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ is an
arbitrary schema mapping, then a given source instance may have no
solution or it may have a solution, but no (finite) universal
solution. However, if $\Sigma_t$ is the union of a weakly acyclic set
of target tgds and a set of egds, then a solution exists if and only
if a universal solution exists. Moreover, the chase procedure can be
used to determine if, given a source instance $I$, a solution for $I$
exists and, if it does, to actually construct a universal solution
$\mathit{chase}(I, \mathcal{M})$ for $I$ in time polynomial in the
size of $I$ (see [19] for details). The definition of weak acyclicity
is given next, followed by the definition of the chase procedure.

Definition 2.1 ([19]). Let $\Sigma$ be a set of tgds over a schema
$\mathbf{T}$. Construct a directed graph, called the dependency graph,
as follows:

• Nodes: For every pair $(R, A)$ with $R$ a relation symbol in
$\mathbf{T}$ and $A$ an attribute of $R$, there is a distinct node;
call such a pair $(R, A)$ a position.

• Edges: For every tgd
$\forall \mathbf{x} (\phi(\mathbf{x}) \to \exists \mathbf{y}\, \psi(\mathbf{x}, \mathbf{y}))$
in $\Sigma$ and for every $x$ in $\mathbf{x}$ that occurs in $\psi$,
and for every occurrence of $x$ in $\phi$ in position $(R, A_i)$:

1. For every occurrence of $x$ in $\psi$ in position $(S, B_j)$, add
an edge $(R, A_i) \to (S, B_j)$ (if it does not already exist).

2. For every existentially quantified variable $y$ and for every
occurrence of $y$ in $\psi$ in position $(T, C_k)$, add a special edge
$(R, A_i) \to (T, C_k)$ (if it does not already exist).

We say that $\Sigma$ is weakly acyclic if the dependency graph has no
cycle going through a special edge.

Figure 5: The dependency graphs for (a)
$\forall x \forall y (E(x, y) \to \exists z\, E(x, z))$ and (b)
$\forall x \forall y (E(x, y) \to \exists z\, E(y, z))$. Special edges
are dotted.

WAGLAV denotes the class of all finite weakly acyclic sets of target
tgds.

The tgd $\forall x \forall y (E(x, y) \to \exists z\, E(x, z))$ is
weakly acyclic; in contrast, the tgd
$\forall x \forall y (E(x, y) \to \exists z\, E(y, z))$ is not,
because the dependency graph contains a special self-loop (see
Figure 5). Moreover, every set of GAV tgds is weakly acyclic, since
the dependency graph contains no special edges in this case.
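
The construction of the dependency graph and the weak-acyclicity test
can be sketched in a few lines. This is a minimal sketch under assumed
conventions that do not come from the paper: tgds are encoded as
(body, head, existential variables) triples, positions are
(relation, index) pairs with 0-based indices, and the names
`dependency_graph` and `is_weakly_acyclic` are hypothetical.

```python
# Hypothetical encoding (not from the paper): a tgd is a triple
# (body_atoms, head_atoms, existential_vars); an atom is (relation, variables);
# a position is (relation, attribute_index), with 0-based indices.

def dependency_graph(tgds):
    """Return (regular_edges, special_edges) of the dependency graph."""
    regular, special = set(), set()
    for body, head, existentials in tgds:
        head_vars = {v for _, args in head for v in args}
        for rel, args in body:
            for i, x in enumerate(args):
                if x not in head_vars:
                    continue  # only body variables that also occur in the head
                for hrel, hargs in head:
                    for j, v in enumerate(hargs):
                        if v == x:
                            regular.add(((rel, i), (hrel, j)))
                        elif v in existentials:
                            special.add(((rel, i), (hrel, j)))
    return regular, special


def is_weakly_acyclic(tgds):
    """True iff no cycle of the dependency graph goes through a special edge."""
    regular, special = dependency_graph(tgds)
    successors = {}
    for u, v in regular | special:
        successors.setdefault(u, set()).add(v)

    def reachable(src, dst):
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        return False

    # A special edge u -> v lies on a cycle iff u is reachable from v.
    return not any(reachable(v, u) for u, v in special)


tgd_a = ([("E", ("x", "y"))], [("E", ("x", "z"))], {"z"})  # E(x,y) -> ∃z E(x,z)
tgd_b = ([("E", ("x", "y"))], [("E", ("y", "z"))], {"z"})  # E(x,y) -> ∃z E(y,z)
gav = ([("E", ("x", "y"))], [("F", ("y", "x"))], set())    # a GAV tgd

print(is_weakly_acyclic([tgd_a]))  # True
print(is_weakly_acyclic([tgd_b]))  # False: special self-loop on the second position of E
print(is_weakly_acyclic([gav]))    # True: no special edges at all
```

The reachability search is quadratic in the worst case; it is chosen
here for brevity, not efficiency.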

What follows is the definition of the chase procedure.

Definition 2.2 (chase procedure [H]). Let $K$ be an instance.

(tgd) Let $d$ be a tgd
$\phi(\mathbf{x}) \to \exists \mathbf{y}\, \psi(\mathbf{x}, \mathbf{y})$.
Let $h$ be a homomorphism from $\phi(\mathbf{x})$ to $K$ such that
there is no extension of $h$ to a homomorphism $h'$ from
$\phi(\mathbf{x}) \wedge \psi(\mathbf{x}, \mathbf{y})$ to $K$. We say
that $d$ can be applied to $K$ with homomorphism $h$.

Let $K'$ be the union of $K$ with the set of facts obtained by: (a)
extending $h$ to $h'$ such that each variable in $\mathbf{y}$ is
assigned a fresh labeled null, followed by (b) taking the image of the
atoms of $\psi$ under $h'$. We say that the result of applying $d$ to
$K$ with $h$ is $K'$, and write $K \xrightarrow{d,h} K'$.

(egd) Let $d$ be an egd $\phi(\mathbf{x}) \to (x_1 = x_2)$. Let $h$ be
a homomorphism from $\phi(\mathbf{x})$ to $K$ such that
$h(x_1) \neq h(x_2)$. We say that $d$ can be applied to $K$ with
homomorphism $h$. We distinguish two cases.

• If both $h(x_1)$ and $h(x_2)$ are in Const then we say that the
result of applying $d$ to $K$ with $h$ is “failure”, and write
$K \xrightarrow{d,h} \bot$.

• Otherwise, let $K'$ be $K$ where we identify $h(x_1)$ and $h(x_2)$
as follows: if one is a constant, then the labeled null is replaced
everywhere by the constant; if both are labeled nulls, then one is
replaced everywhere by the other. We say that the result of applying
$d$ to $K$ with $h$ is $K'$, and write $K \xrightarrow{d,h} K'$.

In the above, $K \xrightarrow{d,h} K'$ (including the case where $K'$
is $\bot$) is called a chase step. We now define chase sequences and
finite chases.

Let $\Sigma$ be a set of tgds and egds, and let $K$ be an instance.

• A chase sequence of $K$ with $\Sigma$ is a sequence (finite or
infinite) of chase steps $K_i \xrightarrow{d_i,h_i} K_{i+1}$, with
$i = 0, 1, \ldots$, with $K = K_0$ and $d_i$ a dependency in $\Sigma$.

• A finite chase of $K$ with $\Sigma$ is a finite chase sequence
$K_i \xrightarrow{d_i,h_i} K_{i+1}$, $0 \le i < m$, with the
requirement that either (a) $K_m = \bot$ or (b) there is no dependency
$d_i$ of $\Sigma$ and there is no homomorphism $h_i$ such that $d_i$
can be applied to $K_m$ with $h_i$. We say that $K_m$ is the result of
the finite chase. We refer to case (a) as the case of a failing finite
chase and we refer to case (b) as the case of a successful finite
chase.

In the context of data exchange, we chase the source instance first
with the source-to-target constraints, and then continue chasing with
the target constraints. The nature of s-t tgds ensures that no atoms
are created over the source schema, so in this setting the result of
chasing a source instance $I$ with a schema mapping $\mathcal{M}$ is a
pair $(I, J)$ where $J$ is a target instance. We usually refer to $J$
alone as the result of the chase.
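
The chase steps above can be sketched as a small fixpoint loop. This
is a simplified illustration, not the authors' implementation: the
encoding of facts and dependencies, the `Null` class, and the function
names `matches` and `chase` are assumptions made here, dependencies
are assumed to contain only variables, and the loop is only guaranteed
to terminate under conditions such as weak acyclicity. The
two-assignment source instance is hypothetical.

```python
from itertools import count
from typing import NamedTuple


class Null(NamedTuple):
    """A labeled null; every other value is treated as a constant."""
    id: int


def matches(atoms, facts, h=None):
    """Yield each homomorphism (dict variable -> value) from `atoms` into `facts` extending h."""
    h = h or {}
    if not atoms:
        yield dict(h)
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for frel, fargs in facts:
        if frel == rel and len(fargs) == len(args):
            g = dict(h)
            if all(g.setdefault(a, v) == v for a, v in zip(args, fargs)):
                yield from matches(rest, facts, g)


def chase(instance, tgds, egds):
    """Return the chased set of facts, or None for a failing chase."""
    facts = set(instance)
    fresh = count()
    while True:
        applied = False
        for body, head, existentials in tgds:
            for h in list(matches(body, facts)):
                if next(matches(head, facts, h), None) is not None:
                    continue  # h already extends to the head: tgd not applicable
                h2 = dict(h)
                for y in existentials:
                    h2[y] = Null(next(fresh))  # one fresh labeled null per variable
                facts |= {(rel, tuple(h2[a] for a in args)) for rel, args in head}
                applied = True
        for body, (x1, x2) in egds:
            for h in matches(body, facts):
                v1, v2 = h[x1], h[x2]
                if v1 == v2:
                    continue
                if not isinstance(v1, Null) and not isinstance(v2, Null):
                    return None  # two distinct constants: failure
                old, new = (v1, v2) if isinstance(v1, Null) else (v2, v1)
                facts = {(rel, tuple(new if v == old else v for v in args))
                         for rel, args in facts}
                applied = True
                break  # the instance changed; re-scan in the next round
        if not applied:
            return facts


# The mapping of Figure 1 and a hypothetical two-assignment source instance.
tgds = [([("Task_Assignments", ("p", "t", "d"))],
         [("Departments", ("p", "d")), ("Tasks", ("p", "t"))], ()),
        ([("Stakeholders_old", ("t", "s"))], [("Stakeholders_new", ("t", "s"))], ())]
egds = [([("Departments", ("p", "d")), ("Departments", ("p", "d2"))], ("d", "d2"))]
I = {("Task_Assignments", ("peter", "tpsreport", "software")),
     ("Task_Assignments", ("peter", "meetbobs", "exec"))}
print(chase(I, tgds, egds))  # None: the key egd equates two distinct constants

result = chase({("Task_Assignments", ("peter", "tpsreport", "software"))}, tgds, egds)

# A second, hypothetical mapping in which an egd replaces a null by a constant.
K = {("E", ("a", "b")), ("G", ("b", "c"))}
merged = chase(K, [([("E", ("x", "y"))], [("F", ("y", "z"))], ("z",))],
               [([("F", ("y", "z")), ("G", ("y", "w"))], ("z", "w"))])
```

For simplicity the sketch keeps source and target facts in one set, so
the target instance $J$ is obtained by filtering out the source
relations.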

We will also make use of the notion of rank [19]. Let $\Sigma$ be a
finite weakly acyclic set of tgds. For every node $(R, A)$ in the
dependency graph of $\Sigma$, define an incoming path to be any
(finite or infinite) path ending in $(R, A)$. Define the rank of
$(R, A)$, denoted by $\mathit{rank}(R, A)$, as the maximum number of
special edges on any such incoming path. Since $\Sigma$ is weakly
acyclic, there are no cycles going through special edges; hence,
$\mathit{rank}(R, A)$ is finite. The rank of $\Sigma$, denoted
$\mathit{rank}(\Sigma)$, is the maximum of $\mathit{rank}(R, A)$ over
all positions $(R, A)$ in the dependency graph of $\Sigma$.
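
As an illustration of the definition of rank, the following sketch
computes ranks by relaxing edges until a fixpoint is reached. The
function name `position_ranks` and the encoding of the graph as sets
of edges are assumptions made here; the example graphs are the two
dependency graphs of Figure 5, with positions written as
(relation, attribute number).

```python
def position_ranks(positions, regular, special):
    """Return {position: rank} for a weakly acyclic dependency graph.

    rank(p) is the maximum number of special edges on a path ending in p.
    Raises ValueError if some cycle goes through a special edge.
    """
    rank = {p: 0 for p in positions}
    weighted = [(u, v, 0) for u, v in regular] + [(u, v, 1) for u, v in special]
    # In a weakly acyclic graph no path can use the same special edge twice,
    # so the number of special edges bounds every rank.
    limit = len(special)
    changed = True
    while changed:
        changed = False
        for u, v, w in weighted:
            if rank[u] + w > rank[v]:
                if rank[u] + w > limit:
                    raise ValueError("dependency graph is not weakly acyclic")
                rank[v] = rank[u] + w
                changed = True
    return rank


E1, E2 = ("E", 1), ("E", 2)

# Figure 5(a): E(x,y) -> ∃z E(x,z); regular edge E.1 -> E.1, special edge E.1 -> E.2.
ranks_a = position_ranks([E1, E2], regular={(E1, E1)}, special={(E1, E2)})
print(ranks_a, "rank of the set:", max(ranks_a.values()))

# Figure 5(b): E(x,y) -> ∃z E(y,z); the special self-loop on E.2 is rejected.
try:
    position_ranks([E1, E2], regular={(E2, E1)}, special={(E2, E2)})
except ValueError as err:
    print("rejected:", err)
```

In example (a) the position E.1 has rank 0 and E.2 has rank 1, so the
rank of the tgd is 1.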

If $q$ is a query over the target schema $\mathbf{T}$ and $I$ is a
source instance, then the certain answers of $q$ with respect to
$\mathcal{M}$ are defined as

$\mathrm{certain}(q, I, \mathcal{M}) = \bigcap \{ q(J) : J \text{ is a solution for } I \text{ w.r.t. } \mathcal{M} \}$

Definition 2.3. Let $J$ be an instance which may contain null values,
and let $q$ be a conjunctive query over the schema of $J$. Then
$q_{\downarrow}(J)$ is defined as the answers of $q$ on $J$ that
contain no null values.

If $J$ is a universal solution for a source instance $I$ w.r.t. a
schema mapping $\mathcal{M}$, then for every conjunctive query $q$, it
holds that $\mathrm{certain}(q, I, \mathcal{M}) = q_{\downarrow}(J)$.
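
To make the null-free answers of Definition 2.3 concrete, here is a
small sketch that evaluates a conjunctive query and then discards the
answers containing nulls. The `Null` class, the function names, and
the three-fact instance are hypothetical and chosen only for
illustration; the query is the boss query of Figure 1.

```python
from typing import NamedTuple


class Null(NamedTuple):
    """A labeled null; every other value is treated as a constant."""
    id: int


def evaluate(atoms, answer_vars, facts):
    """Evaluate a conjunctive query, given as a list of atoms over variables."""
    bindings = [{}]
    for rel, args in atoms:
        extended = []
        for h in bindings:
            for frel, fargs in facts:
                if frel == rel and len(fargs) == len(args):
                    g = dict(h)
                    if all(g.setdefault(a, v) == v for a, v in zip(args, fargs)):
                        extended.append(g)
        bindings = extended
    return {tuple(h[v] for v in answer_vars) for h in bindings}


def null_free(answers):
    """Keep only the answer tuples that contain no labeled null."""
    return {t for t in answers if not any(isinstance(v, Null) for v in t)}


# Hypothetical target instance with one labeled null.
J = {("Tasks", ("peter", "tpsreport")),
     ("Stakeholders_new", ("tpsreport", "lumbergh")),
     ("Stakeholders_new", ("tpsreport", Null(0)))}

boss = [("Tasks", ("person", "task")),
        ("Stakeholders_new", ("task", "stakeholder"))]

answers = evaluate(boss, ("person", "stakeholder"), J)
print(null_free(answers))  # {('peter', 'lumbergh')}
```

The query has two answers on this instance, but only the one built
entirely from constants survives the filter.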

2.3 Repairs and Consistent Answers. 

Let $\Sigma$ be a set of constraints over some relational schema. An
inconsistent database is a database that violates at least one
constraint in $\Sigma$. Informally, a repair of an inconsistent
database $I$ is a consistent database $I'$ that differs from $I$ in a
“minimal” way. This notion can be formalized in several different
ways [1].

1. A symmetric-difference-repair of $I$, denoted $\oplus$-repair of
$I$, is an instance $I'$ that satisfies $\Sigma$ and where there is no
instance $I''$ such that $I \oplus I'' \subset I \oplus I'$ and $I''$
satisfies $\Sigma$. Here, $I \oplus I'$ denotes the set of facts that
form the symmetric difference of the instances $I$ and $I'$.

2. A subset-repair of $I$ is an instance $I'$ that satisfies $\Sigma$
and where there is no instance $I''$ such that
$I' \subset I'' \subseteq I$ and $I''$ satisfies $\Sigma$.

3. A superset-repair of $I$ is an instance $I'$ that satisfies
$\Sigma$ and where there is no instance $I''$ such that
$I' \supset I'' \supseteq I$ and $I''$ satisfies $\Sigma$.

Clearly, subset-repairs and superset-repairs are also
$\oplus$-repairs; however, a $\oplus$-repair need not be a
subset-repair or a superset-repair.

The consistent answers of a query $q$ on $I$ with respect to $\Sigma$
are defined as:

$\oplus\text{-CQA}(q, I, \Sigma) = \bigcap \{ q(I') : I' \text{ is a } \oplus\text{-repair of } I \text{ w.r.t. } \Sigma \}$

with subset and superset versions defined analogously.
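
For a single key constraint, subset-repairs and consistent answers can
be computed by brute force, as in the sketch below. The function
names, the three-tuple Departments relation, and the person "samir"
are hypothetical; the enumeration is exponential and is meant only to
mirror the definitions.

```python
from itertools import combinations


def satisfies_key(facts):
    """Key constraint on binary facts (k, v): k determines v (an egd)."""
    seen = {}
    return all(seen.setdefault(k, v) == v for k, v in facts)


def subset_repairs(facts, consistent):
    """All maximal consistent subsets of `facts` (exponential brute force)."""
    facts = list(facts)
    candidates = [frozenset(c)
                  for k in range(len(facts) + 1)
                  for c in combinations(facts, k)
                  if consistent(c)]
    return [c for c in candidates if not any(c < d for d in candidates)]


def consistent_answers(query, repairs):
    """Intersect the answers of `query` over all repairs."""
    return set.intersection(*(set(query(r)) for r in repairs))


# Hypothetical Departments relation violating the key person -> department.
departments = {("peter", "software"), ("peter", "exec"), ("samir", "software")}

repairs = subset_repairs(departments, satisfies_key)
certain = consistent_answers(lambda r: r, repairs)  # query: return every tuple

print(len(repairs))  # 2
print(certain)       # {('samir', 'software')}
```

Each repair keeps one of peter's two tuples, so only the tuple for
samir belongs to every repair.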

3 Framework and Related Work

In this section, we introduce the exchange-repair framework, discuss
its structural and algorithmic properties, and explore its
relationship to inconsistency-tolerant semantics in data integration
and ontology-based data access.

3.1 The Exchange-Repair Framework 

Definition 3.1. Let
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
schema mapping, $I$ a source instance, and $(I', J')$ a pair of a
source instance and a target instance.

1. We say that $(I', J')$ is a symmetric-difference exchange-repair
solution (in short, a $\oplus$-XR-solution) for $I$ w.r.t.
$\mathcal{M}$ if $(I', J')$ satisfies $\mathcal{M}$, and there is no
pair of instances $(I'', J'')$ such that
$I \oplus I'' \subset I \oplus I'$ and $(I'', J'')$ satisfies
$\mathcal{M}$.

2. We say that $(I', J')$ is a subset exchange-repair solution (in
short, a subset-XR-solution) for $I$ with respect to $\mathcal{M}$ if
$I' \subseteq I$ and $(I', J')$ satisfies $\mathcal{M}$, and there is
no pair of instances $(I'', J'')$ such that
$I' \subset I'' \subseteq I$ and $(I'', J'')$ satisfies $\mathcal{M}$.

Note that the minimality condition in the preceding definitions
applies to the source instance $I'$ but not to the target instance
$J'$ of the pair $(I', J')$. The source instance $I'$ of a
$\oplus$-XR-solution (subset-XR-solution) for $I$ is called a
source-repair (respectively, subset source-repair) of $I$.

Figure 6 shows the two XR-solutions for our source instance and schema
mapping. Notice that the shared origins of tuples are taken into
account (for example, peter performs tasks only for his assigned
department, unlike in Figure 3), but the XR-solutions retain more
derived target information than the instances in Figure 4 (by
preferring to satisfy tgds by adding rather than deleting tuples). If
we now evaluate boss(peter, $b$) over each target instance, and take
the intersection, we have {(peter, bobs)}, which aligns well with our
intuitive expectations. A precise semantics for query answering is
given later in this section.
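
The computation described above can be replayed on a small example.
The sketch below is an illustration under stated assumptions, not the
paper's algorithm: the source instance is hypothetical (chosen to
agree with the tuples mentioned in the running example, not copied
from Figure 2), the function names are invented here, and the check
for the existence of a solution relies on the fact that the mapping of
Figure 1 has only full s-t tgds and a single key egd. Subset
source-repairs are found by brute-force enumeration.

```python
from itertools import combinations


def exchange(source):
    """Apply the two full s-t tgds of Figure 1 to a source instance."""
    target = set()
    for rel, args in source:
        if rel == "Task_Assignments":
            p, t, d = args
            target |= {("Departments", (p, d)), ("Tasks", (p, t))}
        elif rel == "Stakeholders_old":
            target.add(("Stakeholders_new", args))
    return target


def satisfies_egd(target):
    """Key egd of Figure 1: a person belongs to at most one department."""
    dept = {}
    for rel, args in target:
        if rel == "Departments":
            p, d = args
            if dept.setdefault(p, d) != d:
                return False
    return True


def source_repairs(source):
    """Maximal subinstances of `source` that have a solution (brute force)."""
    source = frozenset(source)
    repairs = []
    for k in range(len(source), -1, -1):  # largest candidates first
        for sub in map(frozenset, combinations(source, k)):
            if satisfies_egd(exchange(sub)) and not any(sub < r for r in repairs):
                repairs.append(sub)
    return repairs


def boss(target):
    """boss(person, stakeholder) = ∃task. Tasks(person, task) ∧ Stakeholders_new(task, stakeholder)"""
    tasks = {args for rel, args in target if rel == "Tasks"}
    stakeholders = {args for rel, args in target if rel == "Stakeholders_new"}
    return {(p, s) for p, t in tasks for t2, s in stakeholders if t == t2}


# Hypothetical source instance (an assumption, not the contents of Figure 2).
I = {("Task_Assignments", ("peter", "tpsreport", "software")),
     ("Task_Assignments", ("peter", "spaceout", "software")),
     ("Task_Assignments", ("peter", "meetbobs", "exec")),
     ("Stakeholders_old", ("tpsreport", "lumbergh")),
     ("Stakeholders_old", ("tpsreport", "portman")),
     ("Stakeholders_old", ("spaceout", "bobs")),
     ("Stakeholders_old", ("meetbobs", "bobs"))}

repairs = source_repairs(I)
xr_certain = set.intersection(*(boss(exchange(r)) for r in repairs))

print(len(repairs))  # 2
print(xr_certain)    # {('peter', 'bobs')}
```

One source-repair drops the exec assignment and the other drops both
software assignments; the stakeholder bobs is reachable in both, which
is why (peter, bobs) is the only answer in the intersection.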

Source-repairs constitute a new notion that, in general, has different
properties from those of the standard database repairs. Indeed, as
mentioned earlier, a $\oplus$-repair need not be a subset repair. In
contrast, Theorem 3.2 (below) asserts that the state of affairs is
different for source-repairs. Recall that, according to the notation
introduced earlier, GLAV+(WAGLAV, EGD) denotes the collection of all
schema mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ such
that $\Sigma_{st}$ is a finite set of s-t tgds and $\Sigma_t$ is the
union of a finite weakly acyclic set of target tgds with a finite set
of target egds.

Lemma 3.1. Let $\mathcal{M}$ be a GLAV+(GLAV, EGD) schema mapping. If
$I' \supseteq I$ are two source instances, then every solution for
$I'$ w.r.t. $\mathcal{M}$ is also a solution for $I$ w.r.t.
$\mathcal{M}$, and consequently if $I$ has no solution w.r.t.
$\mathcal{M}$ then $I'$ has no solution w.r.t. $\mathcal{M}$.

Proof. Let $I' \supseteq I$ be two source instances. We will show that
if $I'$ has a solution w.r.t. $\mathcal{M}$ then $I$ also has a
solution w.r.t. $\mathcal{M}$. Let $J$ be an arbitrary solution for
$I'$ w.r.t. $\mathcal{M}$. Let
$\phi(\mathbf{x}) \to \exists \mathbf{y}\, \psi(\mathbf{x}, \mathbf{y})$
be an arbitrary tgd in $\Sigma_{st}$, and let
$h : \mathbf{x} \to \mathrm{adom}(I)$ be a homomorphism such that
$h(\phi(\mathbf{x})) \subseteq I$, and of course
$h(\phi(\mathbf{x})) \subseteq I'$ as well. Since $(I', J)$ satisfies
$\Sigma_{st}$, $h$ can be extended to some homomorphism $h'$ such that
$h'(\psi(\mathbf{x}, \mathbf{y})) \subseteq J$, and therefore $(I, J)$
together satisfy $\Sigma_{st}$, and since $J$ satisfies $\Sigma_t$, we
have that $J$ is also a solution for $I$ w.r.t. $\mathcal{M}$. □

Theorem 3.2. Let $\mathcal{M}$ be a GLAV+(GLAV, EGD) schema mapping.
Let $I$ be a source instance. Then if $(I', J')$ is a
$\oplus$-XR-solution of $I$ w.r.t. $\mathcal{M}$, then $(I', J')$ is
actually a subset-XR-solution of $I$ w.r.t. $\mathcal{M}$.
Consequently, every source-repair of $I$ is also a
subset-source-repair of $I$.

Figure 6: Two XR-solutions for $I$ w.r.t. $\mathcal{M}$. The
Stakeholders tables are omitted for brevity.

Proof. Let $(I', J')$ be a $\oplus$-XR-solution for $I$ w.r.t.
$\mathcal{M}$. Suppose $I' \setminus I \neq \emptyset$. Then by
Lemma 3.1, $J'$ is also a solution for $I' \cap I$. Since
$I' \oplus I \supset (I' \cap I) \oplus I$, we have that $(I', J')$
fails the minimality criterion and thus is not a $\oplus$-XR-solution
for $I$ w.r.t. $\mathcal{M}$, which is a contradiction. □

Remark. From here on and in view of Theorem 3.2, we will use the term
XR-solution to mean subset-XR-solution; similarly, source-repair will
mean subset source-repair.

Note that if $\mathcal{M}$ is a GLAV+(WAGLAV, EGD) schema mapping,
then source-repairs always exist. The reason is that, since the pair
$(\emptyset, \emptyset)$ trivially satisfies $\mathcal{M}$, then for
every source instance $I$, there must exist a maximal subinstance $I'$
of $I$ for which a solution $J'$ w.r.t. $\mathcal{M}$ exists; hence,
$I'$ is a source-repair for $I$ w.r.t. $\mathcal{M}$.

We now claim that the following statements are true for arbitrary
source instances and schema mappings.

1. Repairs of the target instance obtained by chasing with the tgds of
the schema mapping are not necessarily XR-solutions.

2. Repairs of $(I, \emptyset)$ are not necessarily XR-solutions.

For the first statement, consider the pair $(I, J')$ from Figures 2
and 3, where $J'$ is a $\oplus$-repair of the inconsistent result $J$
of the chase of $I$. Clearly, $(I, J')$ is not an XR-solution, because
$J'$ is not a solution for $I$. For the second statement, consider the
pairs $(I_1, J_1)$, $(I_2, J_2)$, $(I_3, J_3)$ in Figure 4, all of
which are $\oplus$-repairs of $(I, \emptyset)$. The first two are also
XR-solutions of $I$, but the third one is not.

It can also be shown that XR-solutions are not necessarily
$\oplus$-repairs of $(I, \emptyset)$. We now describe an important
case in which XR-solutions are $\oplus$-repairs of $(I, \emptyset)$.
For this, we recall the notion of a core universal solution from m. By
definition, a core universal solution is a universal solution that has
no homomorphism to a proper subinstance. If a universal solution
exists, then a core universal solution also exists. Moreover, core
universal solutions are unique up to isomorphism.

Proposition 3.3. Let Ad fee a GLAV+(glav, egd) 
schema mapping. If I is source instance and {I',J') 
is an XR-solution for I w.r.t. Ad such that .J' is a 
core universal solution for I’ w.r.t. Ad, then 
is a ^-repair of (/, 0) w.r.t. Tj^t U 

Proof. Let be a pair of instances 

which together satisfy Egt U St and such that 
(/", J")©(/,0) C (/', J')®(^,0)- Since (/', J') is an 
XR-solution for I w.r.t. Ad, there is no instance I" 
such that /' C I" Q I and I" has a solution w.r.t. 
Ad. Therefore it must be that I" = I' . Furthermore, 
since J' is a core universal solution, there is no 
proper subinstance J" C J' that is a solution for 
r w.r.t. Ad, so J" = J'. Therefore (/', J') is a 
©-repair of (/, 0) w.r.t. Sgt U St. □ 

Next, we present the second key notion in the 
exchange-repair framework. 

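Definition 3.2. Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
schema mapping and $q$ a query over the target schema
$\mathbf{T}$. If $I$ is a source instance, then the XR-certain answers
of $q$ on $I$ w.r.t. $\mathcal{M}$ is the set

$$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(J') : (I', J') \text{ is an XR-solution for } I \,\}.$$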

Note that when I has a solution w.r.t. M, it is its 
own only XR-solution. Thus the XR-certain seman¬ 
tics coincide with certain semantics when solutions 
exist. The next results provide a comparison of the 
XR-certain answers with the consistent answers. 

Proposition 3.4. Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
GLAV+(WAGLAV, EGD) schema mapping and $q$ a conjunctive
query over the target schema $\mathbf{T}$. If $I$
is a source instance, then
$\text{XR-certain}(q, I, \mathcal{M}) \supseteq \oplus\text{-CQA}(q, (I, \emptyset), \Sigma_{st} \cup \Sigma_t)$.
Moreover, this containment may be a proper one.

Proof. Since $\mathcal{M}$ is weakly acyclic, for any instance
$I$ for which solutions exist, a core universal
solution also exists. Therefore, we have that

$$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(J') : (I', J') \text{ is an XR-solution for } I \text{ w.r.t. } \mathcal{M} \text{, and } J' \text{ is a core universal solution for } I' \text{ w.r.t. } \mathcal{M} \,\}.$$

By Proposition 3.3, the set
of XR-solutions $(I', J')$ where $J'$ is a core universal
solution for $I'$ w.r.t. $\mathcal{M}$ is a subset (maybe proper) of
the set of $\oplus$-repairs of $(I, \emptyset)$ w.r.t. $\Sigma_{st} \cup \Sigma_t$. Therefore
$\text{XR-certain}(q, I, \mathcal{M}) \supseteq \oplus\text{-CQA}(q, (I, \emptyset), \Sigma_{st} \cup \Sigma_t)$.

To see that this containment may be a
proper one, consider the schema mapping $\mathcal{M}$
and query boss(peter, b) in Figure [TJ and the
repairs of $(I, \emptyset)$ in Figure ID It is easy
to verify that $\oplus\text{-CQA}(\text{boss}(\text{peter}, b), (I, \emptyset), \Sigma_{st} \cup \Sigma_t) = \emptyset$,
while $\text{XR-certain}(\text{boss}(\text{peter}, b), I, \mathcal{M}) = \{(\text{peter}, \text{bobs})\}$. □

The following proposition pertains to the case
where $\Sigma_{st}$ is the copy mapping, i.e., for each relation
$R \in \mathbf{S}$ there is a corresponding relation $R'$ of
the same arity in $\mathbf{T}$, and $\Sigma_{st}$ contains only the tgd
$R(\mathbf{x}) \rightarrow R'(\mathbf{x})$ for each $R \in \mathbf{S}$. We say an instance $J$
is the copy of an instance $I$ if $J$ is the canonical universal
solution for $I$ w.r.t. the copy mapping (so it
contains the same facts up to renaming of relations).


Proposition 3.5. Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
GAV+EGD schema mapping where $\Sigma_{st}$ is the copy
mapping, and let $q$ be a conjunctive query over the
target schema $\mathbf{T}$. Then for every instance $I$, it holds
that $\text{XR-certain}(q, I, \mathcal{M}) = \text{subset-CQA}(q, J, \Sigma_t)$,
where $J$ is the copy of $I$.

Proof. Since $\Sigma_{st}$ specifies the copy mapping and $\Sigma_t$
contains only egds, for every source repair $I'$ there is
an XR-solution $(I', J')$ where $J'$ is the copy of $I'$. Furthermore,
$J'$ is a universal solution for $I'$ w.r.t. $\mathcal{M}$, so
we can write

$$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(J') \mid (I', J') \text{ is an XR-solution for } I \text{ w.r.t. } \mathcal{M} \text{ and } J' \text{ is the copy of } I' \,\}.$$

Therefore

$$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(J') \mid J' \text{ is the copy of a maximal subset of } I \text{ such that } J' \models \Sigma_t \,\}.$$

Let $J$ be the copy of $I$. Then

$$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(J') \mid J' \text{ is a maximal subset of } J \text{ such that } J' \models \Sigma_t \,\},$$

which is precisely $\text{subset-CQA}(q, J, \Sigma_t)$. □

Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a schema mapping and
$q$ a Boolean query over $\mathbf{T}$. We consider two natural
decision problems in the exchange-repair framework,
and give upper bounds for their computational complexity.

• Source-Repair Checking: Given a source instance
$I$ and a source instance $I' \subseteq I$, is $I'$ a
source-repair of $I$ w.r.t. $\mathcal{M}$?

• XR-certain Query Answering: Given a source
instance $I$, does $\text{XR-certain}(q, I, \mathcal{M})$ evaluate to
true? In other words, is $q(J')$ true on every target
instance $J'$ for which there is a source instance $I'$
such that $(I', J')$ is an XR-solution for $I$?

Theorem 3.6. Let $\mathcal{M}$ be a GLAV+(WAGLAV, EGD)
schema mapping.

1. The source-repair checking problem is in PTIME. 

2. Let q be a union of conjunctive queries over the 
target schema. The XR-certain query answering 
problem for q is in coNP. 

Moreover, there is a schema mapping specified by 
copy s-t tgds and target egds, and a Boolean conjunc¬ 
tive query for which the XR-certain query answering 
problem is coNP-complete. Thus, the data complex¬ 
ity of the XR-certain answers for Boolean conjunctive 
queries is coNP-complete. 


Proof. For the first part, the following is a polynomial
time algorithm to check if $I' \subseteq I$ is a source repair of
$I$ w.r.t. $\mathcal{M}$:

Use the chase procedure to check that $I'$ has
a solution w.r.t. $\mathcal{M}$ [H]. For every tuple
$t \in I \setminus I'$, use the chase procedure to check
that $I' \cup \{t\}$ does not have a solution w.r.t.
$\mathcal{M}$.

The first step ensures that $I'$ has a solution, and by
Lemma o the second step is sufficient to ensure
that $I'$ is a maximal such subset of $I$. Since $\mathcal{M}$ is
weakly acyclic, this algorithm runs in time which is
polynomial in the size of $I$.
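
The two steps of this check can be sketched in Python. This is only an illustration: the `has_solution` argument is a hypothetical stand-in for the chase-based existence test, and `key_consistent` is an assumed toy oracle for a copy mapping whose target egds make the first attribute of every relation a key. Neither name comes from the paper.

```python
from typing import Callable, FrozenSet, Tuple

Fact = Tuple[str, tuple]          # (relation name, tuple of constants)
Instance = FrozenSet[Fact]


def is_source_repair(candidate: Instance, source: Instance,
                     has_solution: Callable[[Instance], bool]) -> bool:
    """Check the two conditions used in the first part of the proof."""
    if not candidate <= source:
        return False
    # Step 1: the candidate itself must have a solution.
    if not has_solution(candidate):
        return False
    # Step 2: adding back any single removed fact must leave no solution.
    return all(not has_solution(candidate | {t}) for t in source - candidate)


def key_consistent(instance: Instance) -> bool:
    """Hypothetical oracle: copy s-t tgds plus 'first attribute is a key'."""
    seen = {}
    for rel, args in instance:
        if seen.setdefault((rel, args[0]), args) != args:
            return False
    return True


# Hypothetical source instance with one key conflict on P.
I = frozenset({("P", ("a", 1)), ("P", ("a", 2)), ("P", ("b", 3))})
repair = frozenset({("P", ("a", 1)), ("P", ("b", 3))})
too_small = frozenset({("P", ("a", 1))})
```

The sketch calls the oracle once for the candidate and once per removed fact, which mirrors the polynomial number of chase runs in the argument above.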

For the second part, the following is an algorithm
in NP to check if $\text{XR-certain}(q, I, \mathcal{M})$ is false:

Let $I'$ be an arbitrary subset of $I$. Using the
algorithm from the first part, check that $I'$
is a source repair of $I$. If so, check that
$q(\mathit{chase}(I')) = \text{false}$.

For the matching lower bound, consider the schema
mapping $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ and target conjunctive
query $q$, where

$\mathbf{S} = \{P(x, y), Q(x, y)\}$,
$\mathbf{T} = \{P'(x, y), Q'(x, y)\}$,
$\Sigma_{st} = \{P(x, y) \rightarrow P'(x, y),\ Q(x, y) \rightarrow Q'(x, y)\}$,
$\Sigma_t = \{P'(x, y) \wedge P'(x, y') \rightarrow y = y',\ Q'(x, y) \wedge Q'(x, y') \rightarrow y = y'\}$, and
$q = \exists x\, \exists y\, \exists x'\, (P'(x, y) \wedge Q'(x', y))$.

Note that $\Sigma_{st}$ is the copy mapping; therefore, we have
$\text{XR-certain}(q, I, \mathcal{M}) = \oplus\text{-CQA}(q, J, \Sigma_t)$, where $J$ is
merely a copy of $I$. For the given target query and
target constraints, the latter is known to be coNP-hard
in data complexity [niET]. □
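
For intuition, the XR-certain answer for this particular mapping and query can be computed by brute force. The sketch below is an assumption-laden illustration, not the paper's algorithm: it enumerates all subsets (exponential time), works directly on source relation names because the mapping only copies `P` and `Q`, and uses hypothetical data.

```python
from itertools import combinations


def consistent(inst):
    """Key egds: in each of P and Q, the first attribute determines the second."""
    seen = {}
    for rel, x, y in inst:
        if seen.setdefault((rel, x), y) != y:
            return False
    return True


def source_repairs(source):
    """All maximal consistent subsets of the source (exponential enumeration)."""
    facts = sorted(source)
    ok = [frozenset(c)
          for r in range(len(facts) + 1)
          for c in combinations(facts, r)
          if consistent(c)]
    return [s for s in ok if not any(s < other for other in ok)]


def q(inst):
    """Boolean query: some value y occurs as second attribute of both P and Q."""
    p_vals = {y for rel, _, y in inst if rel == "P"}
    q_vals = {y for rel, _, y in inst if rel == "Q"}
    return bool(p_vals & q_vals)


def xr_certain(source):
    """True iff q holds on the copy of every source repair."""
    return all(q(r) for r in source_repairs(source))


# Hypothetical instances: I1 has a key conflict on P, I2 is consistent.
I1 = {("P", "a", "1"), ("P", "a", "2"), ("Q", "b", "1")}
I2 = {("P", "a", "1"), ("Q", "b", "1"), ("Q", "c", "2")}
```

On `I1` one repair keeps `P(a, 2)`, on which the query fails, so the XR-certain answer is false; on `I2` the instance is its own only repair.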

Theorem 3.6 implies that the algorithmic properties
of exchange-repairs are quite different from
those of $\oplus$-repairs. Indeed, as shown in [TJ ITS] ,
for GLAV+(WAGLAV, EGD) schema mappings, the $\oplus$-repair-checking
problem is in coNP (and can be
coNP-complete), and the data complexity of the consistent
answers of Boolean conjunctive queries is $\Pi^p_2$-complete
m- This drop in complexity can be directly
attributed to Theorem 3.2.


3.2 Related Work 

The work reported here builds directly on the work 
of many others, in particular the foundational work 
on database repairs and consistent query answering 
by Arenas, Bertossi, and Chomicki [1], and on data 
exchange and certain query answering by Fagin et 
al. [TO] . 

As mentioned earlier, the main conceptual con¬ 
tribution of this paper is the introduction of an 
inconsistency-tolerant semantics for data exchange, 
called exchange-repairs, in which we consider repairs 
to the source instance. Inconsistency-tolerant se¬ 
mantics have been studied in several different areas 
of database management, including data integration 
and ontology-based data access (OBDA). The com¬ 
mon motivation for inconsistency-tolerant semantics 
is to give non-trivial and, in fact, meaningful seman¬ 
tics to query answering. We now discuss the re¬ 
lationship between the XR-certain answers and the 
inconsistency-tolerant semantics of queries in these 
different contexts. 

3.3 Connections with data integration 

In [13] and |3H], the authors introduce and study 
the notion of loosely-sound semantics for queries in 
a data integration setting. There are two main differences
between that setting and ours. To begin
with, they consider schema mappings that consist
of GAV (global-as-view)
constraints between the source (local) schema and
the target (global) schema, and also of key constraints
and inclusion dependencies on the target schema;
in contrast, we consider richer constraint languages, 
namely, GLAV (global-and-local-as-view) constraints 
between source and target, and also target egds and 
target tgds. More importantly perhaps, the loosely- 
sound semantics are, in general, different from the 
XR-certain answers semantics. Specifically, given a
source instance $I$, the loosely-sound semantics are
obtained by first computing the result $J$ of the chase
of $I$ with the GAV constraints between the source
and the target, and then considering as “repairs” all
instances $J'$ that satisfy the target constraints and
are inclusion maximal in their intersection with $J$.


If all target constraints are egds (in particular, if 
all target constraints are key constraints), then it is 
easy to show that, for target conjunctive queries, the 
loosely-sound semantics coincide with the consistent 
answers of queries with respect to subset repairs of J. 
Thus, in this case, the loosely-sound semantics give
the same unsatisfactory answers as the materialize-then-repair
approach seen in Figure 3. Concretely,
this approach yields the instance $J'$ in Figure 3 as one
possible “repair” of the instance $J$ in Figure 1, and includes
the undesirable answers (peter, portman) and
(peter, lumbergh) to the query boss(peter, b). Thus,
this same example shows that the loosely-sound semantics
are different from the XR-certain semantics.

In [T3], Cali, Lembo, and Rosati consider the no¬ 
tions of loosely-sound, loosely-complete, and loosely- 
exact semantics of queries on an inconsistent 
database. We note that the loosely-exact semantics 
coincide with the consistent-answer semantics with 
respect to symmetric-difference-repairs of the incon¬ 
sistent database. 

3.4 Connections with ontology-based 
data access 

Ontology-based data access (OBDA), originally in¬ 
troduced in US], is a framework for answering queries 
over knowledge bases. In that framework, a knowledge
base over a schema $\mathbf{T}$ is a pair $(D, \Sigma)$, where $D$ is
a $\mathbf{T}$-instance and $\Sigma$ is a set of constraints expressed in
some logical formalism over $\mathbf{T}$. The instance $D$ represents
extensional knowledge given by the facts of
$D$, and is called the ABox. The set $\Sigma$ of constraints
represents intensional knowledge, and is called the
TBox. In most scenarios, the schema $\mathbf{T}$ consists of
unary relation symbols, called concepts, and of binary
relation symbols, called roles. Moreover, $\Sigma$ typically
consists of sentences in some description logic.
An inconsistency-tolerant semantics in the context of 
OBDA was first investigated in [3D]; this semantics 
is based on the notion of AR-repairs (ABox-repairs) 
and has become known as AR-semantics. Subsequent 
investigations of AR-semantics were carried out in a 
number of papers, including (in chronological order) 
[29l (371 m [TT] lini [33]. These papers have analyzed 
the computational complexity of consistent query an¬ 


swering in OBDA and have also considered several 
variants of the AR-semantics in the OBDA frame¬ 
work. 

Data exchange and OBDA are different frameworks 
that aim to formalize different aspects of data interoperability.
In data exchange there are two schemas,
the source schema and the target schema, with no 
restrictions on the type of relation symbols they con¬ 
tain, while in OBDA there is a single schema that typ¬ 
ically contains only unary and binary relation sym¬ 
bols. Moreover, as seen in the preceding discussion, 
the constraints typically used in data exchange are 
quite different from those typically used in OBDA. 
One notable exception to this is the work reported 
in [33], where the OBDA framework studied allows 
for tuple-generating dependencies (it also allows for 
negative constraints, but not for equality-generating 
dependencies). In spite of these differences, it turns 
out that there are close connections between data 
exchange and OBDA. In what follows, we spell out 
these connections in detail and show that, as regards 
consistent query answering, each of these two frame¬ 
works can simulate the other. 

We first introduce some basic concepts and termi¬ 
nology for OBDA; for the most part, we follow [33] . 

Let $\mathbf{T}$ be a schema and let $(D, \Sigma)$ be a knowledge
base over $\mathbf{T}$. A model of $(D, \Sigma)$ is a $\mathbf{T}$-instance $J$
such that $D \subseteq J$ and $J \models \Sigma$. We write $\mathit{mod}(D, \Sigma)$
for the set of all models of $(D, \Sigma)$.

An AR-repair of $(D, \Sigma)$ is a $\mathbf{T}$-instance $D'$ with the
following properties: (i) $D' \subseteq D$; (ii) $\mathit{mod}(D', \Sigma) \neq \emptyset$;
(iii) $D'$ is an inclusion maximal sub-instance
of $D$ having the second property, i.e., there is no
$\mathbf{T}$-instance $D''$ such that $D' \subsetneq D'' \subseteq D$ and
$\mathit{mod}(D'', \Sigma) \neq \emptyset$. We write $\mathit{drep}(D, \Sigma)$ for the set
of all AR-repairs of $(D, \Sigma)$.

Next, we introduce the notion of consistent query
answering in the context of OBDA. Let $q$ be a
Boolean query over the schema $\mathbf{T}$. We say that $q$
is entailed by $(D, \Sigma)$ under AR-semantics if for every
AR-repair $D'$ in $\mathit{drep}(D, \Sigma)$ and every $\mathbf{T}$-instance
$J \in \mathit{mod}(D', \Sigma)$, we have that $J \models q$. If $q$ is a non-Boolean
query of arity $k$ over the schema $\mathbf{T}$ and $\mathbf{a}$
is a $k$-tuple of constants, then we say that $q(\mathbf{a})$ is
entailed by $(D, \Sigma)$ under AR-semantics if $q(\mathbf{a})$ is entailed
when viewed as a Boolean query; this means
that for every AR-repair $D'$ in $\mathit{drep}(D, \Sigma)$ and every
$\mathbf{T}$-instance $J \in \mathit{mod}(D', \Sigma)$, we have that $\mathbf{a}$ belongs
to the result $q(J)$ of evaluating $q$ on $J$. We
write $\text{AR-certain}(q, D, \Sigma)$ to denote the set of all tuples
$\mathbf{a}$ such that $q(\mathbf{a})$ is entailed by $(D, \Sigma)$ under AR-semantics.
By unraveling the definitions, we see that

$$\text{AR-certain}(q, D, \Sigma) = \bigcap \{\, q(J) : J \in \mathit{mod}(D', \Sigma) \text{ and } D' \in \mathit{drep}(D, \Sigma) \,\}.$$

We are now ready to establish the precise connec¬ 
tions between the exchange-repairs framework and 
the OBDA framework. 

3.4.1 From OBDA to exchange repairs

Assume that $(D, \Sigma)$ is a knowledge base over a
schema $\mathbf{T}$. Let $\mathbf{S}^*$ be the schema of the relation
symbols occurring in $D$; note that $\mathbf{S}^*$ is a (possibly
proper) subschema of $\mathbf{T}$. Let $\mathbf{S}$ be a copy of $\mathbf{S}^*$,
that is, for every relation symbol $R^*$ in $\mathbf{S}^*$, there is a
relation symbol $R$ in $\mathbf{S}$ of the same arity. If $K$ is an
$\mathbf{S}^*$-instance, we will write $K_{\mathbf{S}}$ to denote the $\mathbf{S}$-copy of $K$,
i.e., the $\mathbf{S}$-instance obtained from $K$ by renaming the
facts of $K$ using the corresponding relation symbols
in $\mathbf{S}$. Conversely, if $I$ is an $\mathbf{S}$-instance, then we will
write $I_{\mathbf{S}^*}$ to denote the $\mathbf{S}^*$-copy of $I$.

The next proposition tells us that the OBDA framework
can be simulated by the exchange-repairs framework.
The proof is straightforward, and it is omitted.

Proposition 3.7. Consider the schema mapping
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma)$, where $\Sigma_{st}$ is the set of copy
s-t tgds from $\mathbf{S}$ to $\mathbf{S}^*$. The following statements are
true.

1. For every $\mathbf{T}$-instance $D' \subseteq D$ and every $\mathbf{T}$-instance
$J$, we have that $D' \subseteq J$ and $J \models \Sigma$ if
and only if $J$ is a solution for $D'_{\mathbf{S}}$ w.r.t. $\mathcal{M}$.

2. For every $\mathbf{S}$-instance $I' \subseteq D_{\mathbf{S}}$ and every $\mathbf{T}$-instance
$J$, we have that $J$ is a solution for $I'$
w.r.t. $\mathcal{M}$ if and only if $I'_{\mathbf{S}^*} \subseteq J$ and $J \models \Sigma$.

3. There is a 1-1 correspondence between the AR-repairs
of $(D, \Sigma)$ and the subset source repairs of
$D_{\mathbf{S}}$ w.r.t. $\mathcal{M}$. In fact, they are the same up to
renaming relation symbols in $\mathbf{S}^*$ by their copies
in $\mathbf{S}$.

4. For every query $q$ over $\mathbf{T}$, we have that
$\text{AR-certain}(q, D, \Sigma) = \text{XR-certain}(q, D_{\mathbf{S}}, \mathcal{M})$.

3.4.2 From exchange repairs to OBDA

Assume that $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ is a schema mapping
in which $\Sigma_{st}$ is a set of s-t tgds and $\Sigma_t$ is a
set of arbitrary constraints over $\mathbf{T}$. Recall that the
schemas $\mathbf{S}$ and $\mathbf{T}$ have no relation symbols in common.
The next proposition tells us that the exchange-repairs
framework can be simulated by the OBDA
framework. Since $\Sigma_t$ consists of arbitrary constraints,
Theorem 3.2 does not necessarily apply, so we explicitly
focus on subset source repairs.

Proposition 3.8. Let $I$ be a source instance. Consider
the knowledge base $(I, \Sigma_{st} \cup \Sigma_t)$ with $I$ as the
ABox and the union $\Sigma_{st} \cup \Sigma_t$ over the schema $\mathbf{S} \cup \mathbf{T}$
as the TBox. The following statements are true.

1. For every source instance $I'$, we have that $I'$ is a
subset source repair of $I$ w.r.t. $\mathcal{M}$ if and only if
$I'$ is an AR-repair of $(I, \Sigma_{st} \cup \Sigma_t)$.

2. For every query $q$ over $\mathbf{T}$, we have that
$\text{XR-certain}(q, I, \mathcal{M}) = \text{AR-certain}(q, I, \Sigma_{st} \cup \Sigma_t)$.

Proof. Assume first that $I'$ is a subset source repair of
$I$ w.r.t. $\mathcal{M}$. We have to show that $I'$ is an AR-repair
of $(I, \Sigma_{st} \cup \Sigma_t)$. Since $I'$ is a subset source repair of $I$
w.r.t. $\mathcal{M}$, we have that $I' \subseteq I$. Moreover, there is a
solution $J$ for $I'$ w.r.t. $\mathcal{M}$. Let $J' = I' \cup J$. Clearly,
we have that (i) $I' \subseteq J'$ and (ii) $J' \models \Sigma_{st} \cup \Sigma_t$, hence
$\mathit{mod}(I', \Sigma_{st} \cup \Sigma_t) \neq \emptyset$. It remains to show that $I'$
is a maximal sub-instance of $I$ with the preceding
properties (i) and (ii). Towards a contradiction, suppose
that there is a sub-instance $I''$ of $I$ such that
$I' \subsetneq I'' \subseteq I$ and $\mathit{mod}(I'', \Sigma_{st} \cup \Sigma_t) \neq \emptyset$. Consider an
instance $J'' \in \mathit{mod}(I'', \Sigma_{st} \cup \Sigma_t)$. Then $I'' \subseteq J''$ and
$J'' \models \Sigma_{st} \cup \Sigma_t$. Let $J''|_{\mathbf{T}}$ be the restriction of $J''$ to
the target schema $\mathbf{T}$, that is, $J''|_{\mathbf{T}}$ is the sub-instance
of $J''$ consisting of the facts of $J''$ that involve relation
symbols in $\mathbf{T}$ only. We claim that $J''|_{\mathbf{T}}$ is a
solution for $I''$ w.r.t. $\mathcal{M}$. Indeed, $J''|_{\mathbf{T}} \models \Sigma_t$ since
all formulas in $\Sigma_t$ contain atomic formulas from $\mathbf{T}$
only. Moreover, since $I'' \subseteq J''$ and since $J'' \models \Sigma_{st}$,
we have $(I'', J''|_{\mathbf{T}}) \models \Sigma_{st}$. This is so because, since
$\Sigma_{st}$ consists of s-t tgds, the $\mathbf{S}$-facts in $J'' \setminus I''$ play no
role in satisfying $\Sigma_{st}$; we note that this may not hold
if, say, $\Sigma_{st}$ contained target-to-source tgds. It follows
that $I'$ is not a subset source repair for $I$ w.r.t. $\mathcal{M}$,
which is a contradiction.

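Next, assume that $I'$ is an AR-repair of $(I, \Sigma_{st} \cup \Sigma_t)$.
We have to show that $I'$ is a subset source repair
of $I$ w.r.t. $\mathcal{M}$. Since $I'$ is an AR-repair of $(I, \Sigma_{st} \cup \Sigma_t)$,
we have that $I' \subseteq I$ and $\mathit{mod}(I', \Sigma_{st} \cup \Sigma_t) \neq \emptyset$. Let
$J'$ be a member of $\mathit{mod}(I', \Sigma_{st} \cup \Sigma_t)$. Hence, $I' \subseteq J'$
and $J' \models \Sigma_{st} \cup \Sigma_t$. If $J'|_{\mathbf{T}}$ is the restriction of $J'$ to
the target schema $\mathbf{T}$, then $(I', J'|_{\mathbf{T}}) \models \Sigma_{st}$ (because
$\Sigma_{st}$ consists of s-t tgds) and $J'|_{\mathbf{T}} \models \Sigma_t$. Thus, there
is a solution for $I'$ w.r.t. $\mathcal{M}$. Moreover, we claim
that $I'$ is a maximal sub-instance of $I$ for which there
exists a solution w.r.t. $\mathcal{M}$. Indeed, if $I''$ is such that
$I' \subsetneq I'' \subseteq I$ and a solution $J''$ for $I''$ w.r.t. $\mathcal{M}$ exists,
then $I'' \cup J'' \models \Sigma_{st} \cup \Sigma_t$. It follows that $I'$ is not an
AR-repair of $(I, \Sigma_{st} \cup \Sigma_t)$, which is a contradiction.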

Finally, if $q$ is a query over $\mathbf{T}$, then, using the
first part of the proposition, it is easy to show that
$\text{XR-certain}(q, I, \mathcal{M}) = \text{AR-certain}(q, I, \Sigma_{st} \cup \Sigma_t)$. □

4 CQA-Rewritability 

In this section, we show that, for GAV+EGD schema
mappings $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, it is possible to construct
a set of egds $\Sigma_s$ over $\mathbf{S}$ such that an $\mathbf{S}$-instance
$I$ is consistent with $\Sigma_s$ if and only if $I$ has
a solution w.r.t. $\mathcal{M}$. We use this to show that
$\text{XR-certain}(q, I, \mathcal{M})$ for a conjunctive query $q$ coincides
with $\text{subset-CQA}(q_s, I, \Sigma_s)$ for a union of conjunctive
queries $q_s$. Thus, we can employ tools for
consistent query answering with respect to egds in
order to compute XR-certain answers for GAV+EGD
schema mappings.

We will use the well-known technique of GAV unfolding
(see, e.g., [2I]). Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be
a GAV+EGD schema mapping. For each $k$-ary target
relation $T \in \mathbf{T}$, let $q_T$ be the set of all conjunctive
queries $q(x_1, \ldots, x_k) = \exists \mathbf{y}\, (\phi(\mathbf{y}) \wedge x_1 = y_{i_1} \wedge \cdots \wedge x_k = y_{i_k})$,
for $\phi(\mathbf{y}) \rightarrow T(y_{i_1}, \ldots, y_{i_k})$ a GAV tgd belonging
to $\Sigma_{st}$ (recall that we frequently omit universal quantifiers
in our notation, for the sake of readability).

A GAV unfolding of a conjunctive query $q(\mathbf{z})$ over
$\mathbf{T}$ w.r.t. $\Sigma_{st}$ is a conjunctive query over $\mathbf{S}$ obtained
by replacing each occurrence of a target atom $T(\mathbf{z}')$
in $q(\mathbf{z})$ with one of the conjunctive queries in $q_T$ (substituting
variables from $\mathbf{z}'$ for $x_1, \ldots, x_k$, and pulling
existential quantifiers out to the front of the formula).

Similarly, we define a GAV unfolding of an egd
$\phi(\mathbf{x}) \rightarrow x_k = x_l$ over $\mathbf{T}$ w.r.t. $\Sigma_{st}$ to be an egd over
$\mathbf{S}$ obtained by replacing each occurrence of a target
atom $T(\mathbf{z}')$ in $\phi(\mathbf{x})$ by one of the conjunctive queries
in $q_T$ (substituting variables from $\mathbf{z}'$ for $x_1, \ldots, x_k$,
and pulling existential quantifiers out to the front of
the formula as needed, where they become universal
quantifiers).

Figure 7 shows the GAV unfolding of the schema
mapping and query from Figure [TJ
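
The unfolding of a conjunctive query can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: atoms are encoded as `(relation, args)` pairs, a GAV tgd as `(body_atoms, head_atom)`, fresh existential variables get a numeric suffix that is assumed not to clash with query variables, and the single tgd used in the example is inferred from the unfolded query shown in Figure 7 rather than taken from the original mapping. All function names are hypothetical.

```python
from itertools import count, product


def unfold_atom(atom, tgd, fresh):
    """Rewrite one target atom using one GAV tgd given as (body, head)."""
    body, (head_rel, head_args) = tgd
    rel, args = atom
    assert rel == head_rel and len(args) == len(head_args)
    sub, equalities = {}, []
    for head_var, query_term in zip(head_args, args):
        if head_var in sub and sub[head_var] != query_term:
            # Repeated head variable: record the implied equality.
            equalities.append(("=", (sub[head_var], query_term)))
        else:
            sub[head_var] = query_term
    tag = next(fresh)

    def rename(v):
        # Head variables take the query's terms; others become fresh.
        return sub.get(v, f"{v}_{tag}")

    return [(r, tuple(rename(v) for v in vs)) for r, vs in body] + equalities


def gav_unfoldings(query_atoms, tgds):
    """All unfoldings of a conjunctive query body; together they form a UCQ."""
    fresh = count(1)
    choices = [[t for t in tgds if t[1][0] == atom[0]] for atom in query_atoms]
    result = []
    for combo in product(*choices):
        cq = []
        for atom, tgd in zip(query_atoms, combo):
            cq.extend(unfold_atom(atom, tgd, fresh))
        result.append(cq)
    return result


# Assumed tgd: Task_Assignments(p, t, d), Stakeholders_old(t, s) -> boss(p, s)
tgds = [([("Task_Assignments", ("p", "t", "d")),
          ("Stakeholders_old", ("t", "s"))],
         ("boss", ("p", "s")))]
query = [("boss", ("person", "stakeholder"))]
unfolded = gav_unfoldings(query, tgds)
```

With one tgd per target relation the union has a single disjunct; with several tgds per relation, the number of disjuncts is the product of the numbers of applicable tgds over the query atoms.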

Theorem 4.1. Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
GAV+EGD schema mapping, and let $\Sigma_s$ be the set of
all GAV unfoldings of egds in $\Sigma_t$ w.r.t. $\Sigma_{st}$. Let $I$ be
an $\mathbf{S}$-instance. Then the following statements hold:

1. $I$ satisfies $\Sigma_s$ if and only if $I$ has a solution w.r.t.
$\mathcal{M}$.

2. The subset-repairs of $I$ w.r.t. $\Sigma_s$ are the source
repairs of $I$ w.r.t. $\mathcal{M}$.

3. For each conjunctive query $q$ over $\mathbf{T}$, we have that
$\text{XR-certain}(q, I, \mathcal{M}) = \text{subset-CQA}(q_s, I, \Sigma_s)$,
where $q_s$ is the union of GAV unfoldings of $q$
w.r.t. $\Sigma_{st}$.

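Proof. 1. Let $I$ be an $\mathbf{S}$-instance which does not satisfy
$\Sigma_s$. Then there is an egd $\phi_1(\mathbf{x}) \wedge \ldots \wedge \phi_k(\mathbf{x}) \rightarrow x_i = x_j \in \Sigma_s$
which is violated in $I$ by some image
$\phi_1(\mathbf{a}) \wedge \ldots \wedge \phi_k(\mathbf{a})$. By the definition of $\Sigma_s$,
there is an egd $\tau_1(\mathbf{x}) \wedge \ldots \wedge \tau_k(\mathbf{x}) \rightarrow x_i = x_j$ in
$\Sigma_t$, and tgds $\phi_1(\mathbf{x}) \rightarrow \tau_1(\mathbf{x}), \ldots, \phi_k(\mathbf{x}) \rightarrow \tau_k(\mathbf{x})$
in $\Sigma_{st}$. Then for any instance $J$ where $(I, J)$ together
satisfy $\Sigma_{st}$, it holds that $J$ contains the
image $\tau_1(\mathbf{a}) \wedge \ldots \wedge \tau_k(\mathbf{a})$ and therefore violates
$\Sigma_t$. The proof of the converse is similar.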

2. Consider that the source repairs are the maximal
subsets of $I$ for which solutions exist. Using the
above, we have that these are also the maximal
subsets of $I$ which satisfy $\Sigma_s$, and therefore they
are also the subset repairs of $I$ w.r.t. $\Sigma_s$.

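$\Sigma_s = \{\, \text{Task\_Assignments}(p, t, d) \wedge \text{Task\_Assignments}(p, t', d') \rightarrow d = d' \,\}$

$\text{boss}_s(\mathit{person}, \mathit{stakeholder}) = \exists\, \mathit{task}, \mathit{department}\, (\text{Task\_Assignments}(\mathit{person}, \mathit{task}, \mathit{department}) \wedge \text{Stakeholders\_old}(\mathit{task}, \mathit{stakeholder}))$

Figure 7: The GAV unfolding of the schema mapping and query given in Figure [TJ

3. By definition $\text{XR-certain}(q, I, \mathcal{M})$ is the intersection
over $q(J')$ for all XR-solutions $(I', J')$ w.r.t.
$\mathcal{M}$ (or in other words, for all source repairs $I'$
and solutions $J'$ for $I'$ w.r.t. $\mathcal{M}$). Observe that
this is the intersection of $\text{certain}(q, I', \mathcal{M})$ over
all source repairs $I'$ w.r.t. $\mathcal{M}$. We will now show
that $\text{certain}(q, I', \mathcal{M}) = q_s(I')$.

Let $J'$ be the solution for $I'$ w.r.t.
$\mathcal{M}$. Suppose $\mathbf{a}$ is a tuple in
$q(J')$. Then there is some image
$\tau_1(\mathbf{a}, \mathbf{b}) \wedge \ldots \wedge \tau_k(\mathbf{a}, \mathbf{b})$ of $q$ in $J'$,
and there are some tgds $\phi_1(\mathbf{x}, \mathbf{y}) \rightarrow \tau_1(\mathbf{x}, \mathbf{y}), \ldots, \phi_k(\mathbf{x}, \mathbf{y}) \rightarrow \tau_k(\mathbf{x}, \mathbf{y})$ in $\Sigma_{st}$
where the image $\phi_1(\mathbf{a}, \mathbf{b}) \wedge \ldots \wedge \phi_k(\mathbf{a}, \mathbf{b})$
is in $I'$. By definition the clause
$\exists \mathbf{y}\, \phi_1(\mathbf{x}, \mathbf{y}) \wedge \ldots \wedge \phi_k(\mathbf{x}, \mathbf{y})$ is in $q_s$, so $\mathbf{a}$
is in $q_s(I')$. The proof of the converse
is similar.

We now have that $\text{XR-certain}(q, I, \mathcal{M})$ is the intersection
over $q_s(I')$ for all source repairs $I'$ of
$I$ w.r.t. $\mathcal{M}$. By the second item of the theorem,
this gives the intersection over $q_s(I')$ for all
subset repairs $I'$ of $I$ w.r.t. $\Sigma_s$, which is simply
$\text{subset-CQA}(q_s, I, \Sigma_s)$.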

□ 
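
To make the third item concrete, the following sketch evaluates the unfolded query of Figure 7 over all subset repairs with respect to the unfolded egd of Figure 7. It is a brute-force illustration only; the data is hypothetical and the function names are not from the paper.

```python
from itertools import combinations


def satisfies_sigma_s(tasks):
    """Unfolded egd of Figure 7: each person has a single department."""
    dept = {}
    return all(dept.setdefault(p, d) == d for p, _, d in tasks)


def subset_repairs(tasks):
    """Maximal subsets of Task_Assignments that satisfy the unfolded egd."""
    facts = sorted(tasks)
    ok = [frozenset(c)
          for r in range(len(facts) + 1)
          for c in combinations(facts, r)
          if satisfies_sigma_s(c)]
    return [s for s in ok if not any(s < o for o in ok)]


def boss_s(tasks, stakeholders):
    """Unfolded query of Figure 7: join on the task attribute."""
    return {(p, s) for p, t, _ in tasks for t2, s in stakeholders if t == t2}


def subset_cqa(tasks, stakeholders):
    """Intersect the unfolded query's answers over all subset repairs."""
    answers = [boss_s(r, stakeholders) for r in subset_repairs(tasks)]
    return set.intersection(*answers)


# Hypothetical data: peter is assigned to two departments.
tasks = {("peter", "software", "it"),
         ("peter", "tps", "it"),
         ("peter", "training", "hr")}
stakeholders = {("software", "bob"), ("training", "bob"), ("tps", "ann")}
```

Since the unfolded egd only mentions Task_Assignments, the Stakeholders_old facts are kept in every repair, so the sketch passes them through unchanged.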

The following result tells us that Theorem 4.1 cannot
be extended to schema mappings containing LAV
s-t tgds.

Theorem 4.2. Consider the LAV+EGD schema mapping
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, where

• $\mathbf{S} = \{R\}$ and $\mathbf{T} = \{T\}$,

• $\Sigma_{st} = \{R(x, y) \rightarrow \exists u\, T(x, u) \wedge T(y, u)\}$, and

• $\Sigma_t = \{T(x, y) \wedge T(x, z) \rightarrow y = z\}$.

Consider the query $q(x, y) = \exists z.\, T(x, z) \wedge T(y, z)$
over $\mathbf{T}$. There does not exist a UCQ $q_s$ over $\mathbf{S}$ and
a set of universal first-order sentences (in particular,
egds) $\Sigma_s$ such that, for every instance $I$, we have that
$\text{XR-certain}(q, I, \mathcal{M}) = \text{subset-CQA}(q_s, I, \Sigma_s)$.

It is worth noting that the schema mapping
$\mathcal{M}$ in the statement of Theorem 4.2 is such that
every source instance has a solution, and hence
“XR-certain” could be replaced by “certain” in the
statement.

Proof. We start by observing that $\text{certain}(q, I, \mathcal{M})$
expresses undirected reachability along the relation
$R$:

Claim: For every $\mathbf{S}$-instance $I$, $\text{certain}(q, I, \mathcal{M}) = \{(a, b) \in \mathit{adom}(I)^2 \mid b \text{ is reachable from } a \text{ by an undirected } R\text{-path}\}$.

The right-to-left inclusion can be proved by induction
on the length of the shortest undirected path
from $a$ to $b$, while, for the left-to-right inclusion, it
is enough to consider the solution $J$ that contains a
null value for each connected component of $I$, and
such that $J$ contains all facts of the form $T(a, N)$ for
$a \in \mathit{adom}(I)$, where $N$ is the null value associated to
the connected component of $I$ to which $a$ belongs.
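
The claim says that the certain answers for this mapping are exactly the pairs of values lying in the same connected component of $I$. A minimal sketch of that computation, using a union-find structure whose component representatives play the role of the nulls in the solution $J$ described above (the function name and sample edges are hypothetical):

```python
def certain_answers(edges):
    """Pairs (a, b) of active-domain values joined by an undirected R-path."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for a, b in edges:
        # An R-fact forces both endpoints to share one null, i.e. one component.
        parent[find(a)] = find(b)
    nodes = list(parent)
    return {(a, b) for a in nodes for b in nodes if find(a) == find(b)}


# Hypothetical R-instance with two connected components.
answers = certain_answers({("a", "b"), ("c", "b"), ("d", "e")})
```

Here `("a", "c")` is an answer even though no directed path connects the two values, which is the undirected behaviour the claim describes.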

Now, suppose for the sake of a contradiction that
$q_s$ and $\Sigma_s$ as described in the statement of the proposition
exist. Let $k$ be the number of variables in
$q_s$. Let $I$ be an instance that consists of a directed
path of length $k + 1$ from $a$ to $b$. It follows from
the above claim, and from our assumption on $q_s$
and $\Sigma_s$, that $(a, b) \in \text{subset-CQA}(q_s, I, \Sigma_s)$, and
for every proper subinstance $I'$ of $I$, we have that
$(a, b) \notin \text{certain}(q, I', \mathcal{M})$.

Claim: The instance $I$ is consistent with $\Sigma_s$.

Suppose for the sake of a contradiction that the
above claim does not hold. Let $I'$ be any subset-repair
of $I$ with respect to $\Sigma_s$. Since $I'$ is a
proper sub-instance of $I$, we have that $(a, b) \notin
\text{certain}(q, I', \mathcal{M})$. In particular, since $I'$ satisfies $\Sigma_s$,
we have that $(a, b) \notin q_s(I')$. But since $I'$ is a repair
of $I$, this means that $(a, b) \notin \text{subset-CQA}(q_s, I, \Sigma_s)$,
a contradiction.

Since $(a, b) \in \text{subset-CQA}(q_s, I, \Sigma_s)$ and $I$ is consistent
with $\Sigma_s$, we have that $(a, b) \in q_s(I)$. That
is, there is a homomorphism $h$ from $q_s$ to $I$. Let $I''$
be the sub-instance of $I$ consisting of the facts involving
only values that are in the image of $h$. Since
$q_s$ contains $k$ variables and $I$ contains $k + 1$ facts, $I''$ is
a proper sub-instance of $I$. Moreover, since universal
first-order sentences are preserved under taking
induced sub-instances, every egd true in $I$ is also
true in $I''$ and therefore, $I''$ is consistent with $\Sigma_s$.
Finally, by construction, $(a, b) \in q_s(I'')$. Therefore,
$(a, b) \in \text{subset-CQA}(q_s, I'', \Sigma_s)$. This contradicts the
fact that $(a, b) \notin \text{certain}(q, I'', \mathcal{M})$.

□ 

The following result tells us that Theorem 4.1 also
cannot be extended to schema mappings containing
GAV target tgds.

Theorem 4.3. Consider the GAV+(GAV, EGD)
schema mapping $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, where

• $\mathbf{S} = \{R\}$ and $\mathbf{T} = \{T\}$,

• $\Sigma_{st} = \{R(x, y) \rightarrow T(x, y)\}$, and

• $\Sigma_t = \{T(x, y) \wedge T(y, z) \rightarrow T(x, z)\}$.

Consider the query $q(x, y) = T(x, y)$ over $\mathbf{T}$. There
does not exist a UCQ $q_s$ over $\mathbf{S}$ and a set of
universal first-order sentences (in particular, egds)
$\Sigma_s$ such that, for every instance $I$, we have that
$\text{XR-certain}(q, I, \mathcal{M}) = \text{subset-CQA}(q_s, I, \Sigma_s)$.

Proof. We start by observing that $\text{certain}(q, I, \mathcal{M})$
expresses directed reachability along the
relation $R$: for every $\mathbf{S}$-instance $I$,
$\text{certain}(q, I, \mathcal{M}) = \{(a, b) \in \mathit{adom}(I)^2 \mid b \text{ is reachable from } a \text{ by a directed } R\text{-path}\}$. The
claim is proved by induction on the length of the
path. The remainder of the proof is identical to
that of Theorem 4.2 (the difference between directed
paths and undirected paths is inessential to the
argument).

□ 
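
For this mapping, the chase simply copies $R$ into $T$ and then closes $T$ under transitivity, so the certain answers to $q(x, y) = T(x, y)$ are the pairs in the transitive closure of $R$. A naive fixpoint sketch (function name and data are hypothetical, and no attempt is made at efficiency):

```python
def chase_transitive(r_facts):
    """Chase R(x,y) -> T(x,y) and T(x,y), T(y,z) -> T(x,z) to a fixpoint."""
    t = set(r_facts)                      # the copy tgd
    while True:
        new = {(x, z) for x, y in t for y2, z in t if y == y2} - t
        if not new:
            return t
        t |= new                          # one round of the transitivity tgd


# Hypothetical directed path a -> b -> c -> d.
closure = chase_transitive({("a", "b"), ("b", "c"), ("c", "d")})
```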


5 DLP-Rewritability 

We saw in the previous section that the applicability
of the CQA-rewriting approach is limited to
GAV+EGD schema mappings. In this section, we consider
another approach to computing XR-certain answers,
based on a reduction to the problem of computing
certain answers over the stable models of a disjunctive
logic program. Our reduction is applicable
to GLAV+(WAGLAV, EGD) schema mappings. First,
we reduce the case of GLAV+(WAGLAV, EGD) schema
mappings to the case of GAV+(GAV, EGD) schema
mappings.

Theorem 5.1. From a GLAV+(WAGLAV, EGD)
schema mapping $\mathcal{M}$ we can construct a
GAV+(GAV, EGD) schema mapping $\hat{\mathcal{M}}$ such
that, from a conjunctive query $q$, we can construct
a union of conjunctive queries $\hat{q}$ with
$\text{XR-certain}(q, I, \mathcal{M}) = \text{XR-certain}(\hat{q}, I, \hat{\mathcal{M}})$.

The proof of Theorem 5.1 is given in Section [5] (it
is entailed by Theorem 6.2). Theorem 4.3 shows that
the CQA-rewriting approach studied in Section 4
is, in general, not applicable to GAV+(GAV, EGD)
schema mappings and unions of conjunctive queries.
To address this problem, we will now consider a different
approach to computing XR-certain answers,
using disjunctive logic programs. Although stable
models are popular in the literature, including for
database repairs, we find that the selective minimization
offered by parallel circumscription is a better fit
for XR-certain semantics because our minimality condition
applies only to the source-part of the schema.
We then use a result from [25] to translate back into
the realm of stable models.

Stable models of disjunctive logic programs have 
been well-studied as a way to compute database repairs
([34] provides a thorough treatment). In [14],
Cali et al. give an encoding of their loosely-sound
semantics for data integration as a disjunctive logic 
program. Their encoding is applicable for non-key¬ 
conflicting sets of constraints, a syntactic condition 
that is orthogonal to weak acyclicity, and which elim¬ 
inates the utility of named nulls. Although their se¬ 
mantics use a notion of minimality that is similar to 


ours, our setting and our syntactic condition differ 
sufficiently that our results are complementary. 

Fix a domain Const. A disjunctive logic program
(DLP) $\Pi$ over a schema $\mathbf{R}$ is a finite collection of rules
of the form

$$\alpha_1 \vee \ldots \vee \alpha_n \leftarrow \beta_1, \ldots, \beta_m, \neg\gamma_1, \ldots, \neg\gamma_k$$

where $n, m, k \geq 0$ and
$\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m, \gamma_1, \ldots, \gamma_k$ are atoms formed
from the relations in $\mathbf{R} \cup \{=\}$, using the constants
in Const and first-order variables. A DLP is said to
be positive if it consists of rules that do not contain
negated atoms except possibly for inequalities. A
DLP is said to be ground if it consists of rules
that do not contain any first-order variables. A
model of $\Pi$ is an $\mathbf{R}$-instance $I$ over domain Const
that satisfies all rules of $\Pi$ (viewed as universally
quantified first-order sentences). A rule in which
$n = 0$ is called a constraint, and is satisfied only if
its body is not satisfied.

A minimal model of $\Pi$ is
a model $M$ of $\Pi$ such that there does not exist a
model $M'$ of $\Pi$ where the facts of $M'$ form a strict
subset of the facts of $M$. More generally, for subsets
$\mathbf{R}_m, \mathbf{R}_f \subseteq \mathbf{R}$, an $(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of $\Pi$ is
a model $M$ of $\Pi$ such that there does not exist a
model $M'$ of $\Pi$ where the facts of $M'$ involving
relations from $\mathbf{R}_m$ form a strict subset of the facts
of $M$ involving relations from $\mathbf{R}_m$, and the set of
facts of $M'$ involving relations from $\mathbf{R}_f$ is equal to
the set of facts of $M$ involving relations from $\mathbf{R}_f$
[25].

Although the minimal model semantics is well-behaved
for positive DLPs, it is not well suited
for programs with negation. The stable model
semantics is a widely used semantics of DLPs that
are not necessarily positive. For positive DLPs, it
coincides with the minimal model semantics. For a
ground DLP $\Pi$ over a schema $\mathbf{R}$ and an $\mathbf{R}$-instance
$M$ over the domain Const, the reduct $\Pi^M$ of $\Pi$ with
respect to $M$ is the DLP containing, for each rule
$\alpha_1 \vee \ldots \vee \alpha_n \leftarrow \beta_1, \ldots, \beta_m, \neg\gamma_1, \ldots, \neg\gamma_k$ with $M \not\models \gamma_i$
for all $i \leq k$, the rule $\alpha_1 \vee \ldots \vee \alpha_n \leftarrow \beta_1, \ldots, \beta_m$. A
stable model of a ground DLP $\Pi$ is an $\mathbf{R}$-instance
$M$ over the domain Const such that $M$ is a minimal
model of the reduct $\Pi^M$. See [22] for more details.
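
The definitions of reduct and stable model can be checked by brute force on a tiny ground program. The sketch below is only an illustration under stated assumptions: a ground rule is encoded as a triple `(head, pos, neg)` of frozensets of propositional atom names, all helper names are hypothetical, and the enumeration is exponential in the number of atoms.

```python
from itertools import chain, combinations


def is_model(m, rules):
    """A rule holds if its head is hit or its body is not satisfied."""
    return all(head & m or not pos <= m or neg & m
               for head, pos, neg in rules)


def reduct(m, rules):
    """Drop rules with a negated atom true in m; strip negation from the rest."""
    return [(head, pos, frozenset())
            for head, pos, neg in rules if not neg & m]


def is_minimal_model(m, rules):
    """m is a model and no proper subset of m is a model."""
    proper_subsets = chain.from_iterable(
        combinations(sorted(m), r) for r in range(len(m)))
    return is_model(m, rules) and not any(
        is_model(frozenset(s), rules) for s in proper_subsets)


def stable_models(atoms, rules):
    """All m that are minimal models of the reduct of the program w.r.t. m."""
    candidates = chain.from_iterable(
        combinations(sorted(atoms), r) for r in range(len(atoms) + 1))
    return [set(c) for c in candidates
            if is_minimal_model(frozenset(c), reduct(frozenset(c), rules))]


# Example program:  p v q <- .    r <- p, not q.
rules = [
    (frozenset({"p", "q"}), frozenset(), frozenset()),
    (frozenset({"r"}), frozenset({"p"}), frozenset({"q"})),
]
models = stable_models({"p", "q", "r"}, rules)
```

For this program the candidate $\{p\}$ is rejected because the reduct still contains $r \leftarrow p$, while $\{p, q\}$ is rejected because it is not minimal for its reduct.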

In this section, we will construct positive DLP programs
whose $(\mathbf{R}_m, \mathbf{R}_f)$-minimal models correspond to
XR-solutions. In light of Theorem 5.1 we may restrict
our attention to GAV+(GAV, EGD) schema mappings.

In [25] it was shown that a positive ground DLP $\Pi$
over a schema $\mathbf{R}$, together with subsets $\mathbf{R}_m, \mathbf{R}_f \subseteq \mathbf{R}$,
can be translated in polynomial time to a (not necessarily
positive) DLP $\Pi'$ over a possibly larger schema
that includes $\mathbf{R}$, such that there is a bijection between
the $(\mathbf{R}_m, \mathbf{R}_f)$-minimal models of $\Pi$ and the
stable models of $\Pi'$, where every pair of instances
that stand in the bijection agree on all facts over the
schema $\mathbf{R}$. This shows that DLP reasoners based on
the stable model semantics, such as DLV [32112] , can
be used to evaluate positive ground disjunctive logic
programs under the $(\mathbf{R}_m, \mathbf{R}_f)$-minimal model semantics.
Although stated only for ground programs in
[25], this technique can be used for arbitrary positive
DLPs through grounding. Note that, when a program
is grounded, inequalities are reduced to $\top$ or
$\bot$.

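Theorem 5.2. Given a GAV+(GAV, EGD) schema
mapping $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, we can construct in
linear time a positive DLP $\Pi$ over a schema $\mathbf{R}$
that contains $\mathbf{S} \cup \mathbf{T}$, and subsets $\mathbf{R}_m, \mathbf{R}_f \subseteq \mathbf{R}$,
such that for every union $q$ of conjunctive queries
over $\mathbf{T}$ and for every $\mathbf{S}$-instance $I$, we have that
$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(M) \mid M \text{ is an } (\mathbf{R}_m, \mathbf{R}_f)\text{-minimal model of } \Pi \cup I \,\}$.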

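Proof. We construct a disjunctive logic program
$\Pi_{XRC}(\mathcal{M})$ for a GAV+(GAV, EGD) schema mapping
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ as follows:

1. For each source relation $S$ with arity $n$, add the
rules

$S_k(x_1, \ldots, x_n) \vee S_d(x_1, \ldots, x_n) \leftarrow S(x_1, \ldots, x_n)$

$\bot \leftarrow S_k(x_1, \ldots, x_n), S_d(x_1, \ldots, x_n)$

$S(x_1, \ldots, x_n) \leftarrow S_k(x_1, \ldots, x_n)$

where $S_k$ and $S_d$ represent the kept and deleted
atoms of $S$, respectively.

2. For each s-t tgd $\phi(\mathbf{x}) \rightarrow T(\mathbf{x}')$ in $\Sigma_{st}$, add the
rule

$T(\mathbf{x}') \leftarrow \alpha_1, \ldots, \alpha_m$

where $\alpha_1, \ldots, \alpha_m$ are the atoms in $\phi(\mathbf{x})$ in which
each relation $S$ has been uniformly replaced by
$S_k$.

3. For each tgd $\phi(\mathbf{x}) \rightarrow T(\mathbf{x}')$ in $\Sigma_t$, add the rule

$T(\mathbf{x}') \leftarrow \alpha_1, \ldots, \alpha_m$

where $\alpha_1, \ldots, \alpha_m$ are the atoms in $\phi(\mathbf{x})$.

4. For each egd $\phi(\mathbf{x}) \rightarrow x_1 = x_2$, where $x_1, x_2 \in \mathbf{x}$,
add the rule

$\bot \leftarrow \alpha_1, \ldots, \alpha_m, x_1 \neq x_2$

where $\alpha_1, \ldots, \alpha_m$ are the atoms in $\phi(\mathbf{x})$.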

We minimize the model w.r.t.
$\mathbf{R}_m = \{S_d \mid S \in \mathbf{S}\}$, and fix
$\mathbf{R}_f = \{S \mid S \in \mathbf{S}\}$. The disjunctive logic
program for $\mathcal{M}$, denoted $\Pi_{\mathrm{XRC}}(\mathcal{M})$,
is a straightforward encoding of the constraints in $\Sigma_{st}$ and
$\Sigma_t$ as disjunctive logic rules over an indefinite view of the
source instance. Since the source instance is fixed, the rules of the
form $S(x_1, \dots, x_n) \leftarrow S_k(x_1, \dots, x_n)$ in
$\Pi_{\mathrm{XRC}}(\mathcal{M})$ force the kept atoms to be a
sub-instance of the source instance. Notice that egds are encoded as
denial constraints, and that disjunction is used only to
non-deterministically choose a subset of the source instance.
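The four construction steps can be sketched as a small rule generator. The input encoding, the `_k` / `_d` name suffixes, and the ASCII rendering (`v` for disjunction, `:-` for the arrow, an empty head for $\bot$) are assumptions of this sketch; the output is meant for reading and is not claimed to be valid input for any particular solver.

```python
def atom(rel, args):
    """Render an atom such as Tasks(p, t)."""
    return f"{rel}({', '.join(args)})"

def xr_program(source_arities, st_tgds, t_tgds, egds):
    """Build the rules of the program, plus the minimized and fixed predicates.

    source_arities: dict mapping each source relation name to its arity
    st_tgds, t_tgds: lists of (body, head); body is a list of (rel, args)
                     pairs and head is a single (rel, args) pair (GAV)
    egds: list of (body, (x1, x2)) meaning body -> x1 = x2
    """
    rules = []
    # Step 1: choose, for every source fact, whether it is kept or deleted.
    for s, n in source_arities.items():
        xs = [f"x{i}" for i in range(1, n + 1)]
        kept, deleted = atom(s + "_k", xs), atom(s + "_d", xs)
        rules.append(f"{kept} v {deleted} :- {atom(s, xs)}.")
        rules.append(f":- {kept}, {deleted}.")
        rules.append(f"{atom(s, xs)} :- {kept}.")
    # Step 2: s-t tgds read from the kept copies of the source relations.
    for body, head in st_tgds:
        kept_body = ", ".join(atom(r + "_k", a) for r, a in body)
        rules.append(f"{atom(*head)} :- {kept_body}.")
    # Step 3: target tgds are copied as they are.
    for body, head in t_tgds:
        rules.append(f"{atom(*head)} :- {', '.join(atom(r, a) for r, a in body)}.")
    # Step 4: egds become denial constraints with an inequality.
    for body, (x1, x2) in egds:
        rules.append(f":- {', '.join(atom(r, a) for r, a in body)}, {x1} != {x2}.")
    r_min = {s + "_d" for s in source_arities}
    r_fix = set(source_arities)
    return rules, r_min, r_fix

# The schema mapping behind Figure 8 (the query rule is omitted).
source_arities = {"Task_Assignments": 3, "Stakeholders_old": 2}
st_tgds = [
    ([("Task_Assignments", ["p", "t", "d"])], ("Departments", ["p", "d"])),
    ([("Task_Assignments", ["p", "t", "d"])], ("Tasks", ["p", "t"])),
    ([("Stakeholders_old", ["t", "s"])], ("Stakeholders_new", ["t", "s"])),
]
egds = [([("Departments", ["p", "d"]), ("Departments", ["p", "d2"])], ("d", "d2"))]
rules, r_min, r_fix = xr_program(source_arities, st_tgds, [], egds)
```

The generator makes a single pass over the constraints and emits a constant number of rules per constraint, which is in line with the linear-time bound stated in Theorem 5.2.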

To prove the theorem, we first show that the restriction of every
$(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of
$\Pi_{\mathrm{XRC}}(\mathcal{M}) \cup I$ to the schema
$\{S_k \mid S \in \mathbf{S}\} \cup \mathbf{T}$ constitutes an
exchange-repair solution. We then show that for every exchange-repair
solution, we can build a corresponding
$(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of
$\Pi_{\mathrm{XRC}}(\mathcal{M}) \cup I$.

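Let $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be
a GAV+(GAV, EGD) schema mapping. Let
$\Pi = \Pi_{\mathrm{XRC}}(\mathcal{M})$. Let $\mathbf{R}$ be the
schema of $\Pi$, and let
$\mathbf{R}_m = \{S_d \mid S \in \mathbf{S}\}$ and
$\mathbf{R}_f = \{S \mid S \in \mathbf{S}\}$. Let $q$ be a union of
conjunctive queries over $\mathbf{T}$, and let $I$ be an
$\mathbf{S}$-instance.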

We first prove that a certain restriction of every
$(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of $\Pi \cup I$ is an
exchange-repair solution. Let $M$ be an
$(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of $\Pi \cup I$. Then for
each source atom $S(c_1, \dots, c_n)$, $M$ satisfies
$S_k(c_1, \dots, c_n) \lor S_d(c_1, \dots, c_n) \leftarrow S(c_1, \dots, c_n)$
and $\bot \leftarrow S_k(c_1, \dots, c_n), S_d(c_1, \dots, c_n)$, and
therefore contains exactly one of $S_k(c_1, \dots, c_n)$ or
$S_d(c_1, \dots, c_n)$. Furthermore, for every atom
$S_k(d_1, \dots, d_n) \in M$, $M$ satisfies
$S(d_1, \dots, d_n) \leftarrow S_k(d_1, \dots, d_n)$. Let $I'$ be a
renaming of the restriction of $M$ to the kept predicates (by removal
of the $k$ subscript), and observe that since $I$ is fixed, $I'$ is a
sub-instance of $I$. Furthermore, since $M \models \Pi \cup I$ (which
contains copies of the constraints of $\mathcal{M}$ over its kept
predicates), $I'$ has a solution w.r.t. $\mathcal{M}$. Finally, an
appropriate renaming (by removal of the $d$ subscript) of the
restriction of $M$ to $\mathbf{R}_m$ (the deleted predicates) is equal
to $I \setminus I'$, and since $M$ is an
$(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of $\Pi \cup I$, we have
that there is no $I''$ such that $I' \subsetneq I'' \subseteq I$ and a
solution exists for $I''$ w.r.t. $\mathcal{M}$. Therefore, $I'$ is a
source repair of $I$ w.r.t. $\mathcal{M}$, and $I'$ along with the
restriction of $M$ to $\mathbf{T}$ is an exchange-repair solution for
$I$ w.r.t. $\mathcal{M}$.

We now prove that for every exchange-repair solution, there exists an
$(\mathbf{R}_m, \mathbf{R}_f)$-minimal model of $\Pi \cup I$. Let
$(I', J')$ be an exchange-repair solution for $I$ w.r.t.
$\mathcal{M}$. Let
$M = I \cup I'_k \cup (I \setminus I')_d \cup J'$, where $I'_k$ is
$I'$ renamed over the kept predicates, and $(I \setminus I')_d$ is
$I \setminus I'$ renamed over the deleted predicates. Since $I'$ is a
subset of $I$, and $I \setminus I'$ is disjoint from $I'$, we have
that the rules of the forms
$S_k(x_1, \dots, x_n) \lor S_d(x_1, \dots, x_n) \leftarrow S(x_1, \dots, x_n)$,
$\bot \leftarrow S_k(x_1, \dots, x_n), S_d(x_1, \dots, x_n)$, and
$S(x_1, \dots, x_n) \leftarrow S_k(x_1, \dots, x_n)$ are satisfied. It
also holds that $(I', J')$ satisfies $\mathcal{M}$, and therefore $M$
is a model of $\Pi \cup I$. Finally, since there is no $I''$ such that
$I' \subsetneq I'' \subseteq I$ and a solution exists for $I''$ w.r.t.
$\mathcal{M}$, $M$ is also $(\mathbf{R}_m, \mathbf{R}_f)$-minimal.

Therefore,
$\text{XR-certain}(q, I, \mathcal{M}) = \bigcap \{\, q(M) \mid M \text{ is an } (\mathbf{R}_m, \mathbf{R}_f)\text{-minimal model of } \Pi \cup I \,\}$.

□
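As a sanity check of the semantics that the program is meant to capture, the following sketch computes source repairs and XR-certain answers by brute force for the schema mapping of Figure 8. The instance data is hypothetical, the mapping is hard-coded, and the function names are chosen for this sketch only. The pruning step relies on the fact that every sub-instance of a source instance with a solution also has a solution.

```python
from itertools import combinations

def chase(task_assignments, stakeholders_old):
    """Apply the three GAV s-t tgds of the example mapping."""
    departments = {(p, d) for (p, t, d) in task_assignments}
    tasks = {(p, t) for (p, t, d) in task_assignments}
    stakeholders_new = set(stakeholders_old)
    return departments, tasks, stakeholders_new

def has_solution(task_assignments, stakeholders_old):
    """The only egd: every person belongs to at most one department."""
    departments, _, _ = chase(task_assignments, stakeholders_old)
    people = [p for (p, _) in departments]
    return len(people) == len(set(people))

def boss(task_assignments, stakeholders_old):
    """The query boss(person, stakeholder) evaluated on the chase result."""
    _, tasks, stakeholders_new = chase(task_assignments, stakeholders_old)
    return {(p, s) for (p, t) in tasks for (t2, s) in stakeholders_new if t == t2}

def split(facts):
    """Separate tagged facts back into the two source relations."""
    ta = {f for (rel, f) in facts if rel == "ta"}
    so = {f for (rel, f) in facts if rel == "so"}
    return ta, so

def source_repairs(task_assignments, stakeholders_old):
    """Maximal sub-instances that have a solution, largest candidates first."""
    facts = [("ta", f) for f in sorted(task_assignments)]
    facts += [("so", f) for f in sorted(stakeholders_old)]
    repairs = []
    for size in range(len(facts), -1, -1):
        for sub in combinations(facts, size):
            candidate = frozenset(sub)
            if any(candidate < r for r in repairs):
                continue  # strictly inside a known repair, so not maximal
            if has_solution(*split(candidate)):
                repairs.append(candidate)
    return repairs

def xr_certain_boss(task_assignments, stakeholders_old):
    """Intersect the query answers over all source repairs."""
    answers = [boss(*split(r)) for r in source_repairs(task_assignments, stakeholders_old)]
    return set.intersection(*answers)

# Hypothetical source instance: peter is assigned to two departments.
TA = {("peter", "tps", "software"), ("peter", "spec", "hardware"),
      ("anna", "spec", "software")}
SO = {("tps", "bill"), ("spec", "carol")}
```

Here the two source repairs each drop one of the conflicting facts about `peter`, so only the answer about `anna` survives the intersection. The enumeration is exponential in the size of the source instance, which is why a DLP encoding is of interest in the first place.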

Figure 8 illustrates the disjunctive logic program obtained for the
schema mapping from Figure 1.

    Task_Assignments_k(p, t, d) ∨ Task_Assignments_d(p, t, d) ← Task_Assignments(p, t, d).
    ⊥ ← Task_Assignments_k(p, t, d), Task_Assignments_d(p, t, d).
    Task_Assignments(p, t, d) ← Task_Assignments_k(p, t, d).
    Stakeholders_old_k(t, s) ∨ Stakeholders_old_d(t, s) ← Stakeholders_old(t, s).
    ⊥ ← Stakeholders_old_k(t, s), Stakeholders_old_d(t, s).
    Stakeholders_old(t, s) ← Stakeholders_old_k(t, s).
    Departments(p, d) ← Task_Assignments_k(p, t, d).
    Tasks(p, t) ← Task_Assignments_k(p, t, d).
    Stakeholders_new(t, s) ← Stakeholders_old_k(t, s).
    ⊥ ← Departments(p, d), Departments(p, d'), d ≠ d'.
    boss(person, stakeholder) ← Tasks(person, task), Stakeholders_new(task, stakeholder).

Figure 8: The disjunctive logic program over (R_m, R_f)-minimal models
for the schema mapping and query given in Figure 1. Here,
R_m = {Task_Assignments_d, Stakeholders_old_d} and
R_f = {Task_Assignments, Stakeholders_old}.

6 From GLAV+(WAGLAV, EGD) to GAV+(GAV, EGD)

In this section, we prove Theorem 5.1 and discuss some additional
literature related to this particular result. Let $\mathcal{M}_1$ and
$\mathcal{M}_2$ be schema mappings with the same source schema. We
will write $\mathcal{M}_1 \to_{\mathrm{UCQ}} \mathcal{M}_2$ if for
every UCQ $q$ over the target schema of $\mathcal{M}_1$, there is a
UCQ $q'$ over the target schema of $\mathcal{M}_2$ such that for all
source instances $I$,
$\text{XR-certain}(q, I, \mathcal{M}_1) = \text{XR-certain}(q', I, \mathcal{M}_2)$.
Using this notation, Theorem 5.1 states that for every
GLAV+(WAGLAV, EGD) schema mapping $\mathcal{M}$ there is a
GAV+(GAV, EGD) schema mapping $\mathcal{M}'$ with
$\mathcal{M} \to_{\mathrm{UCQ}} \mathcal{M}'$. We will in fact prove a
stronger statement that applies to schema mappings defined by
second-order tgds. Second-order tgds serve not only to strengthen the
result, but also to make its proof more natural.

6.1 Second-order TGDs 

Second-order tgds are a natural extension of tgds that was introduced
in [20] in the context of schema mapping composition. We recall the
definition.

Let $\mathbf{f}$ be a collection of function symbols, each having a
designated arity. A simple term is a constant or variable. A compound
term is a function applied to a list of terms, such that the arity of
the function symbol is respected. By an $\mathbf{f}$-term, we mean
either a simple term, or a compound term built up from variables
and/or constants using the function symbols in $\mathbf{f}$. We will
omit $\mathbf{f}$ from the notation when it is understood from
context. The depth of a term is the maximal nesting of function
symbols, with $\mathit{depth}(e) = 0$ when $e$ is a simple term. A
ground term is a term in which no variables appear.


A second-order tgd (SO tgd) over a schema $\mathbf{R}$ is an
expression of the form

$$\sigma = \exists \mathbf{f}\, \big( \forall \mathbf{x}_1 (\phi_1 \to \psi_1) \land \dots \land \forall \mathbf{x}_n (\phi_n \to \psi_n) \big)$$

where $\mathbf{f}$ is a collection of function symbols, and

1. each $\phi_i$ is a conjunction of (a) atoms $S(y_1, \dots, y_k)$
   where $S \in \mathbf{R}$ and $y_1, \dots, y_k$ are variables from
   $\mathbf{x}_i$; and (b) equalities of the form $t_1 = t_2$ where
   $t_1, t_2$ are terms over $\mathbf{x}_i$ and $\mathbf{f}$.

2. each $\psi_i$ is a conjunction of atoms $S(t_1, \dots, t_k)$ where
   $S \in \mathbf{R}$ and $t_1, \dots, t_k$ are $\mathbf{f}$-terms
   built from $\mathbf{x}_i$.

3. each variable in $\mathbf{x}_i$ occurs in a relational atom in
   $\phi_i$.

We say that an $\mathbf{R}$-instance $I$ satisfies $\sigma$ if there
exists a collection of functions $\mathbf{f}^0$ (whose domain and
range are Const $\cup$ Nulls) such that each "clause"
$\forall \mathbf{x}_i (\phi_i \to \psi_i)$ of $\sigma$ is satisfied in
$I$ where each function symbol in $\mathbf{f}$ is interpreted by the
corresponding function in $\mathbf{f}^0$. We will write
$I \models \sigma$ when this is the case, or, if we wish to make
$\mathbf{f}^0$ explicit in the notation,
$I \models \sigma\,[\mathbf{f} \mapsto \mathbf{f}^0]$.

A source-to-target SO tgd for source schema $\mathbf{S}$ and target
schema $\mathbf{T}$ is an SO tgd over $\mathbf{S} \cup \mathbf{T}$, of
the above form, where each $\phi_i$ contains only relation symbols
from $\mathbf{S}$ and each $\psi_i$ contains only relation symbols
from $\mathbf{T}$. We note that, in [20], only source-to-target SO
tgds were considered.

An equality-free SO tgd (efsotgd) is an SO tgd that does not contain
term equalities. We denote by SOTGD+(SOTGD, EGD) the class of schema
mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ where
$\Sigma_{st}$ is a set of source-to-target SO tgds over $\mathbf{S}$
and $\mathbf{T}$, and $\Sigma_t$ is a set of SO tgds and/or egds over
$\mathbf{T}$. Other classes of schema mappings, such as SOTGD+SOTGD
and EFSOTGD+EFSOTGD, are defined analogously. Note that an important
subclass of equality-free SO tgds are the plain SO tgds, introduced in
[6], in which no terms contain nested functions.

It is known that every tgd is logically equivalent to an SO tgd, which
can be obtained from it by skolemization [20]. Although stated in the
literature only for the case of source-to-target tgds [20], the same
applies to target tgds. Figure 10 shows the skolemization of the
example schema mapping in Figure 9.

Moreover, if we adapt the concept of weak acyclicity to SO tgds in the
appropriate way, then every weakly acyclic set of tgds is logically
equivalent to a weakly acyclic SO tgd.

More precisely, we say that a set $\Sigma$ of SO tgds is weakly
acyclic if there is no cycle in its dependency graph containing a
special edge, where the dependency graph associated to a set of SO
tgds is defined as follows:

1. the nodes of the directed graph are positions $(R, i)$, where $R$
   is a relation symbol and $i$ is an attribute position of $R$ (as
   before);

2. there is a normal edge from $(R, i)$ to $(S, j)$ if $\Sigma$
   contains an SO tgd of the form

   $$\sigma = \exists \mathbf{f}\, \big( \forall \mathbf{x}_1 (\phi_1 \to \psi_1) \land \dots \land \forall \mathbf{x}_n (\phi_n \to \psi_n) \big)$$

   and for some $\ell \le n$, $\phi_\ell$ contains a variable in
   position $(R, i)$ and $\psi_\ell$ contains the same variable in
   position $(S, j)$;

3. there is a special edge from $(R, i)$ to $(S, j)$ if $\Sigma$
   contains an SO tgd of the form

   $$\sigma = \exists \mathbf{f}\, \big( \forall \mathbf{x}_1 (\phi_1 \to \psi_1) \land \dots \land \forall \mathbf{x}_n (\phi_n \to \psi_n) \big)$$

   and for some $\ell \le n$, $\phi_\ell$ contains a variable in
   position $(R, i)$ and $\psi_\ell$ contains a compound term in
   position $(S, j)$ containing the same variable.

   $\mathbf{S} = \{R\}$
   $\mathbf{T} = \{T\}$
   $\Sigma_{st} = \{\, R(x, y) \to \exists u\, (T(x, u) \land T(y, u)) \,\}$
   $\Sigma_t = \{\, T(x, y) \land T(x, y') \to y = y' \,\}$
   $q(x, y) = \exists u\, (T(x, u) \land T(y, u))$

Figure 9: An example schema mapping and query

   $\mathbf{S} = \{R\}$
   $\mathbf{T} = \{T\}$
   $\Sigma_{st} = \{\, \exists f\, \big( \forall x, y\, (R(x, y) \to T(x, f(x, y))) \land \forall x, y\, (R(x, y) \to T(y, f(x, y))) \big) \,\}$
   $\Sigma_t = \{\, T(x, y) \land T(x, y') \to y = y' \,\}$
   $q(x, y) = \exists u\, (T(x, u) \land T(y, u))$

Figure 10: Result of skolemizing the schema mapping in Figure 9

We then have:

Proposition 6.1. Every GLAV+(WAGLAV, EGD) schema mapping is logically
equivalent to a weakly acyclic EFSOTGD+(EFSOTGD, EGD) schema mapping.

Indeed, if $\mathcal{M}$ is a GLAV+(WAGLAV, EGD) schema mapping, and
$\mathcal{M}'$ is the EFSOTGD+(EFSOTGD, EGD) schema mapping obtained
from $\mathcal{M}$ by skolemization, then $\mathcal{M}$ and
$\mathcal{M}'$ are logically equivalent. Moreover, it is easy to see
that $\mathcal{M}$ and $\mathcal{M}'$ have the same dependency graph,
and, therefore, $\mathcal{M}'$ is weakly acyclic.
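The dependency graph and the weak acyclicity test for SO tgds can be sketched as follows. The encoding is an assumption of this sketch: a clause is a pair of atom lists, an atom is a `(relation, terms)` pair, a variable is a string, a compound term is a `(function, [subterms])` tuple, constants are ignored, and attribute positions are counted from 0.

```python
def variables_of(term):
    """All variables occurring in a term (strings are variables)."""
    if isinstance(term, str):
        return {term}
    _, args = term
    found = set()
    for a in args:
        found |= variables_of(a)
    return found

def dependency_graph(clauses):
    """Return the sets of normal and special edges between positions."""
    normal, special = set(), set()
    for body, head in clauses:
        for r, r_args in body:
            for i, x in enumerate(r_args):
                for s, s_terms in head:
                    for j, t in enumerate(s_terms):
                        if t == x:
                            normal.add(((r, i), (s, j)))
                        elif not isinstance(t, str) and x in variables_of(t):
                            special.add(((r, i), (s, j)))
    return normal, special

def is_weakly_acyclic(clauses):
    """True iff no cycle of the dependency graph uses a special edge."""
    normal, special = dependency_graph(clauses)
    succ = {}
    for u, v in normal | special:
        succ.setdefault(u, set()).add(v)

    def reaches(start, goal):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(succ.get(node, ()))
        return False

    # A special edge u -> v lies on a cycle iff u is reachable from v.
    return not any(reaches(v, u) for u, v in special)

# The skolemized s-t constraints of Figure 10.
f_xy = ("f", ["x", "y"])
figure_10 = [
    ([("R", ["x", "y"])], [("T", ["x", f_xy])]),
    ([("R", ["x", "y"])], [("T", ["y", f_xy])]),
]
# A hypothetical target clause Q(x, y) -> Q(y, g(x)) that is not weakly acyclic.
cyclic = [([("Q", ["x", "y"])], [("Q", ["y", ("g", ["x"])])])]
```

For the clauses of Figure 10, all edges lead from positions of `R` to positions of `T`, so there is no cycle at all. In the hypothetical `cyclic` clause, the special edge from `(Q, 0)` to `(Q, 1)` is closed into a cycle by the normal edge from `(Q, 1)` to `(Q, 0)`.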

In the remainder of this section, we will establish: 

Theorem 6.2. For every weakly acyclic SOTGD+(SOTGD, EGD) schema
mapping $\mathcal{M}$ there is a GAV+(GAV, EGD) schema mapping
$\mathcal{M}'$ such that
$\mathcal{M} \to_{\mathrm{UCQ}} \mathcal{M}'$.

The proof borrows ideas from previous literature, 
and we discuss relevant related work at the end of 
the section. 




6.2 Eliminating Equalities to Establish Freeness

In this section, we will rewrite our schema mapping to eliminate egds
as well as equality conditions in SO tgds. This allows us to work with
solutions in which there is a one-to-one correspondence between ground
terms (of any depth) and their values. This property, called freeness,
which we define below, is used in Section 6.3.

For simplicity we first restrict attention to EFSOTGD+(SOTGD, EGD)
schema mappings.

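Definition 6.1 (Equality Singularization). Fix a fresh binary relation
symbol Eq.

• The equality singularization of a conjunctive query
  $q(\mathbf{x}) = \exists \mathbf{y}\, \phi(\mathbf{x}, \mathbf{y})$,
  denoted by $q^{\mathrm{Eq}}(\mathbf{x})$, is the conjunctive query
  $\exists \mathbf{y} \mathbf{z}\, \phi'(\mathbf{x}, \mathbf{y}, \mathbf{z})$
  obtained from $q$ as follows: whenever a variable $u$ (free or
  quantified) occurs more than once in $\phi$, we replace each
  occurrence other than the first occurrence by a fresh distinct
  variable $z$ and we add the atom $\mathrm{Eq}(u, z)$.

• The equality singularization of an egd

  $$\sigma = \forall \mathbf{x}\, (\phi \to x_i = x_j)$$

  is the GAV tgd

  $$\sigma^{\mathrm{Eq}} = \forall \mathbf{x} \mathbf{z}\, (\phi' \to \mathrm{Eq}(x_i, x_j))$$

  where $\phi^{\mathrm{Eq}} = \exists \mathbf{z}\, \phi'$.

• The equality singularization of an SO tgd

  $$\sigma = \exists \mathbf{f} \bigwedge_i \big( \forall \mathbf{x}_i (\phi_i \land \alpha_i \to \psi_i) \big)$$

  (where each $\phi_i$ is a conjunction of relational atoms and each
  $\alpha_i$ is a conjunction of equalities) is the equality-free SO
  tgd

  $$\sigma^{\mathrm{Eq}} = \exists \mathbf{f} \bigwedge_i \big( \forall \mathbf{x}_i \mathbf{z}_i (\phi'_i \land \alpha'_i \to \psi_i) \big)$$

  where $\phi_i^{\mathrm{Eq}} = \exists \mathbf{z}_i\, \phi'_i$ and
  $\alpha'_i$ is obtained from $\alpha_i$ by replacing each equality
  $s = t$ by $\mathrm{Eq}(s, t)$.

• The equality singularization of an EFSOTGD+(SOTGD, EGD) schema
  mapping
  $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ is
  the EFSOTGD+EFSOTGD schema mapping

  $$\mathcal{M}^{\mathrm{Eq}} = (\mathbf{S},\ \mathbf{T} \cup \{\mathrm{Eq}\},\ \Sigma_{st},\ \{\sigma^{\mathrm{Eq}} \mid \sigma \in \Sigma_t\} \cup \mathrm{eqAx}(\mathbf{T}))$$

  where $\mathrm{eqAx}(\mathbf{T})$ is the set of (full) tgds of the
  form
  $T(x_1, \dots, x_n) \to \mathrm{Eq}(x_1, x_1) \land \dots \land \mathrm{Eq}(x_n, x_n)$
  where $T$ is a relation in $\mathbf{T}$, along with the tgds
  $\mathrm{Eq}(x_1, x_2) \to \mathrm{Eq}(x_2, x_1)$ and
  $\mathrm{Eq}(x_1, x_2) \land \mathrm{Eq}(x_2, x_3) \to \mathrm{Eq}(x_1, x_3)$.

Figure 11: Equality singularization of the schema mapping and query
from Figure 10

Figure 11 shows equality singularization in action.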
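The first bullet of Definition 6.1 can be sketched for conjunctive query bodies as follows. The encoding (a list of `(relation, variables)` pairs) and the fresh variable names `z1`, `z2`, ... are assumptions of this sketch; in particular, it assumes that no variable of the input is already called `z1`, `z2`, and so on.

```python
def singularize(atoms):
    """Equality singularization of the body of a conjunctive query.

    Every occurrence of a variable other than its first one is replaced
    by a fresh variable z, and the atom Eq(u, z) is added.
    """
    seen = set()
    fresh = 0
    rewritten, eq_atoms = [], []
    for rel, args in atoms:
        new_args = []
        for v in args:
            if v in seen:
                fresh += 1
                z = f"z{fresh}"
                new_args.append(z)
                eq_atoms.append(("Eq", [v, z]))
            else:
                seen.add(v)
                new_args.append(v)
        rewritten.append((rel, new_args))
    return rewritten + eq_atoms

# Body of the query q(x, y) = exists u (T(x, u) and T(y, u)) from Figure 9.
q_body = [("T", ["x", "u"]), ("T", ["y", "u"])]
q_eq_body = singularize(q_body)
```

For the query of Figure 9, the second occurrence of `u` is replaced by `z1`, which yields the body `T(x, u), T(y, z1), Eq(u, z1)`. Joins are thus no longer expressed by repeated variables but only through the relation Eq.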

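Proposition 6.3. Let
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be an
SOTGD+(SOTGD, EGD) schema mapping and let $\mathcal{M}^{\mathrm{Eq}}$
be its equality singularization. For every $\mathbf{S}$-instance $I$,

1. $I$ has a solution w.r.t. $\mathcal{M}$ if and only if $I$ has a
   solution $J'$ w.r.t. $\mathcal{M}^{\mathrm{Eq}}$ such that there is
   no pair of distinct constants $a, b$ where
   $J' \models \mathrm{Eq}(a, b)$.

2. If $I$ has a solution w.r.t. $\mathcal{M}$, then for every UCQ $q$
   over $\mathbf{T}$,
   $\text{certain}(q, I, \mathcal{M}) = \text{certain}(q^{\mathrm{Eq}}, I, \mathcal{M}^{\mathrm{Eq}})$.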

Proof. [$\Rightarrow$] Let $J$ be any $\mathbf{T}$-instance that is a
solution for $I$ with respect to $\mathcal{M}$. Take $J'$ to be the
$\mathbf{T} \cup \{\mathrm{Eq}\}$-instance that extends $J$ with all
facts of the form $\mathrm{Eq}(a, a)$ with $a \in \mathit{adom}(J)$.
It is easy to see that $J'$ is a solution for $I$ with respect to
$\mathcal{M}^{\mathrm{Eq}}$, and that, for all UCQs $q$, we have that
$q_\downarrow(J) = q^{\mathrm{Eq}}_\downarrow(J')$. Moreover, it is
immediate from the construction of $J'$ that there is no pair of
distinct constants $a, b$ where $J' \models \mathrm{Eq}(a, b)$.

[$\Leftarrow$] Let $\mathbf{f}$ be the collection of function symbols
appearing in $\mathcal{M}^{\mathrm{Eq}}$. Let $J$ be a
$\mathbf{T} \cup \{\mathrm{Eq}\}$-instance that is a solution for $I$
with respect to $\mathcal{M}^{\mathrm{Eq}}$ such that there is no pair
of distinct constants $a, b$ where $J \models \mathrm{Eq}(a, b)$. Note
that Eq is an equivalence relation and that each equivalence class
contains at most one constant (but possibly many null values). Let
$\mathbf{f}^0$ be a witnessing collection of functions, such that
$(I, J) \models \mathcal{M}^{\mathrm{Eq}}\,[\mathbf{f} \mapsto \mathbf{f}^0]$.
We will construct a $\mathbf{T}$-instance $J'$ and a new collection of
functions, as follows:

• For every Eq-equivalence class, choose a single representative
  member. If an equivalence class contains a constant, we use that
  constant as the representative member. For every value
  $u \in \mathit{adom}(J)$, denote by $\pi(u)$ the representative
  member of the Eq-equivalence class to which $u$ belongs.

• $J'$ contains, for every fact $T(u_1, \dots, u_n)$ of $J$ (where
  $T \in \mathbf{T}$), the corresponding fact
  $T(\pi(u_1), \dots, \pi(u_n))$.

• The new collection of functions contains, for each function $f$ in
  $\mathbf{f}^0$, the corresponding function $f'$ given by
  $f'(\mathbf{u}) = \pi(f(\mathbf{u}))$.

By construction, we have that, for any image
$q^{\mathrm{Eq}}(\mathbf{a})$ in $J$ of the equality singularization
of a conjunctive query $q(\mathbf{x})$, we have an image
$q(\mathbf{a})$ in $J'$, and vice versa. This tells us both that $J'$
is a solution for $I$ w.r.t. $\mathcal{M}$ and that for any UCQ $q$
over $\mathbf{T}$, we have
$q_\downarrow(J') = q^{\mathrm{Eq}}_\downarrow(J)$. Additionally,
since each Eq-class is represented by a single member in $J'$, we have
that, for each egd $\sigma \in \Sigma_t$, the fact that $J$ satisfies
$\sigma^{\mathrm{Eq}}$ implies that $J'$ satisfies $\sigma$. □

The importance of Proposition 6.3 comes from the following
observation. Consider any SOTGD+SOTGD schema mapping
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$. Let
$\mathbf{f}$ be the collection of all function symbols occurring in SO
tgds in $\Sigma_{st} \cup \Sigma_t$. A solution $J$ for a source
instance $I$ with respect to $\mathcal{M}$ is said to be a free
solution if there is a collection of functions $\mathbf{f}^0$ such
that
$(I, J) \models \Sigma_{st} \cup \Sigma_t\,[\mathbf{f} \mapsto \mathbf{f}^0]$,
and such that each function in $\mathbf{f}^0$ is injective and the
functions all have mutually disjoint ranges. Equivalently, in a free
solution, each value in $\mathit{adom}(J)$ is the denotation of
exactly one ground term. If, furthermore, we have that each value in
$\mathit{adom}(J)$ is the denotation of a (unique) term of depth at
most $k$, then we say that $J$ is a free solution of rank $k$.

Proposition 6.4. Let $\mathcal{M}$ be the equality singularization of
a weakly acyclic EFSOTGD+SOTGD schema mapping. There is a natural
number $k \ge 0$ such that every source instance $I$ has a free
universal solution $J$ of rank $k$.

Proof (sketch). Let $\mathbf{f}^0$ be an arbitrary collection of
injective and mutually range-disjoint functions. Let $J$ be the result
of chasing $I$ with the SO tgds of $\mathcal{M}$ using these
functions. A priori, $J$ is potentially infinite. However, we can show
that $J$ is always finite and, moreover, of finite rank. This is
proved by induction: we associate to each position $(R, i)$ (where $R$
is a relation symbol and $i$ an attribute of $R$) a rank, namely the
maximal number of special edges on an incoming path to $(R, i)$ in the
dependency graph times the maximal depth of a term occurring in the
right-hand side of an SO tgd. Then, we can prove by a straightforward
induction on $k$ that for all positions $(R, i)$ of rank $k$, each
value in position $(R, i)$ is the denotation of a term of depth at
most $k$. □

It is not hard to see that the same does not hold 
in the presence of egds. 

The above definition of $\mathcal{M}^{\mathrm{Eq}}$ applies only to
EFSOTGD+(SOTGD, EGD) schema mappings. However, it can be extended to
arbitrary SOTGD+(SOTGD, EGD) schema mappings
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ as
follows: from $\mathcal{M}$, we first construct a schema mapping
$\mathcal{M}' = (\mathbf{S}, \mathbf{T}', \Sigma_{\mathrm{copy}}, \Sigma'_t)$
where $\mathbf{T}' = \mathbf{T} \cup \{R' \mid R \in \mathbf{S}\}$;
$\Sigma_{\mathrm{copy}} = \{\forall \mathbf{x}\, (R(\mathbf{x}) \to R'(\mathbf{x})) \mid R \in \mathbf{S}\}$;
and
$\Sigma'_t = \Sigma_t \cup \{\sigma' \mid \sigma \in \Sigma_{st}\}$,
where $\sigma'$ is a copy of $\sigma$ in which every occurrence of a
relation $R \in \mathbf{S}$ is replaced by $R'$. We then define the
equality singularization $\mathcal{M}^{\mathrm{Eq}}$ to be the
equality singularization of $\mathcal{M}'$. It is easy to see that
Proposition 6.3 and Proposition 6.4 then hold true for arbitrary
SOTGD+(SOTGD, EGD) schema mappings.

6.3 The Skeleton Rewriting Step 

Suppose $\mathcal{M}$ is a weakly acyclic EFSOTGD+EFSOTGD schema
mapping, and $\mathcal{M}^{\mathrm{Eq}}$ is the equality
singularization of $\mathcal{M}$. Since $\mathcal{M}^{\mathrm{Eq}}$
admits free universal solutions, we can represent the value of every
compound term simply by its syntax. This makes it possible to rewrite
$\mathcal{M}^{\mathrm{Eq}}$ in such a way that the syntax of compound
terms is captured using specialized relations, and constraints with
only simple terms.

The skeleton of a term is the expression obtained by replacing all
constants and variables by $\bullet$, where $\bullet$ is a fixed
symbol that is not a function symbol. Thus, for example, the skeleton
of $f(g(x, y), z)$ is $f(g(\bullet, \bullet), \bullet)$. The arity of
a skeleton $s$, denoted by $\mathit{arity}(s)$, is the number of
occurrences of $\bullet$, and the depth of a skeleton is defined in
the same way as for terms. If $s, s'_1, \dots, s'_k$ are skeletons
with $\mathit{arity}(s) = k$, then we denote by
$s(s'_1, \dots, s'_k)$ the skeleton of arity
$\mathit{arity}(s'_1) + \dots + \mathit{arity}(s'_k)$ obtained by
replacing, for each $i \le k$, the $i$-th occurrence of $\bullet$ in
$s$ by $s'_i$.
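The notions of skeleton, arity, depth, and skeleton composition can be sketched as follows. The term encoding (strings for simple terms, `(function, [subterms])` tuples for compound terms) and the function names are assumptions of this sketch.

```python
BULLET = "•"

def skeleton(term):
    """Replace every constant and variable of a term by the bullet symbol."""
    if isinstance(term, str):
        return BULLET
    f, args = term
    return (f, [skeleton(a) for a in args])

def arity(s):
    """Number of bullet occurrences in a skeleton."""
    if s == BULLET:
        return 1
    return sum(arity(a) for a in s[1])

def depth(s):
    """Maximal nesting of function symbols; a bullet has depth 0."""
    if isinstance(s, str):
        return 0
    return 1 + max((depth(a) for a in s[1]), default=0)

def compose(s, subs):
    """The skeleton s(s'_1, ..., s'_k): the i-th bullet of s becomes subs[i]."""
    assert len(subs) == arity(s)
    remaining = iter(subs)

    def go(t):
        if t == BULLET:
            return next(remaining)
        return (t[0], [go(a) for a in t[1]])

    return go(s)

def show(s):
    """Render a skeleton in the notation of the text."""
    if isinstance(s, str):
        return s
    return f"{s[0]}({','.join(show(a) for a in s[1])})"

# The example term f(g(x, y), z) from the text.
example_skeleton = skeleton(("f", [("g", ["x", "y"]), "z"]))
# Plugging the skeletons f(•) and • into g(•,•).
composed = compose(skeleton(("g", ["x", "y"])), [skeleton(("f", ["u"])), BULLET])
```

The left-to-right traversal in `compose` is what fixes the meaning of "the $i$-th occurrence of $\bullet$"; the arity of the result is the sum of the arities of the plugged-in skeletons, as in the text.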

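Definition 6.2. Let
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
weakly acyclic EFSOTGD+EFSOTGD schema mapping with rank $r$ and whose
most deeply nested term has depth $d$. Let $\Theta$ be the set of
functions appearing in $\Sigma_t$. Define the skeleton rewriting of
$\mathcal{M}$ as the schema mapping
$\mathcal{M}^{\mathrm{skel}} = (\mathbf{S}, \mathbf{T}^{\mathrm{skel}}, \Sigma_{st}^{\mathrm{skel}}, \Sigma_t^{\mathrm{skel}})$
where:

• For every $n$-ary relation $T \in \mathbf{T}$, let
  $\mathbf{T}^{\mathrm{skel}}$ contain all relations of the form
  $T_{s_1, \dots, s_n}$ where $s_1, \dots, s_n$ are skeletons of depth
  less than or equal to $r$.

• For every clause $\phi(\mathbf{x}) \to T(\tau_1, \dots, \tau_n)$ of
  an s-t efsotgd in $\Sigma_{st}$, let $\Sigma_{st}^{\mathrm{skel}}$
  contain the s-t tgd
  $\phi(\mathbf{x}) \to T_{s_1, \dots, s_n}(\mathbf{x}')$, where
  $s_1, \dots, s_n$ are the skeletons for $\tau_1, \dots, \tau_n$
  respectively, and $\mathbf{x}'$ is the sequence of variables in
  $\tau_1, \dots, \tau_n$.

• For every clause $\phi(\mathbf{x}) \to T(\tau_1, \dots, \tau_n)$ of
  an efsotgd in $\Sigma_t$ (where $\mathbf{x} = x_1, \dots, x_m$), let
  $\Sigma_t^{\mathrm{skel}}$ contain the tgd
  $\phi_{s_1, \dots, s_m}(\mathbf{y}_1, \dots, \mathbf{y}_m) \to T_{s'_1, \dots, s'_n}(\mathbf{y}'_1, \dots, \mathbf{y}'_n)$,
  where

  - $s_1, \dots, s_m$ are $\Theta$-skeletons of depth at most
    $r \cdot d$;

  - each $\mathbf{y}_i$ is a sequence of $\mathit{arity}(s_i)$ fresh
    variables;

  - $\phi_{s_1, \dots, s_m}(\mathbf{y}_1, \dots, \mathbf{y}_m)$ is
    obtained from $\phi$ by replacing each atom
    $R(x_{i_1}, \dots, x_{i_l})$ by
    $R_{s_{i_1}, \dots, s_{i_l}}(\mathbf{y}_{i_1}, \dots, \mathbf{y}_{i_l})$;

  - $s'_i$ is a $\Theta$-skeleton of depth at most $r \cdot d$ such
    that $s'_i = s_k$ (if $\tau_i$ is the term $x_k$) and
    $s'_i = t_i(s_1, \dots, s_m)$ (if $\tau_i$ is the term
    $t_i(x_1, \dots, x_m)$);

  - $\mathbf{y}'_i = (y_k^1, \dots, y_k^{\mathit{arity}(s_k)})$ (if
    $\tau_i$ is the term $x_k$) and
    $\mathbf{y}'_i = (y_1^1, \dots, y_1^{\mathit{arity}(s_1)}, \dots, y_m^1, \dots, y_m^{\mathit{arity}(s_m)})$
    (if $\tau_i$ is the term $t_i(x_1, \dots, x_m)$).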

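In addition, for each conjunctive query
$q(\mathbf{x}) = \exists \mathbf{y}\, \phi(\mathbf{x}, \mathbf{y})$
over $\mathbf{T}$ with $\mathbf{x} = x_1, \dots, x_n$ and
$\mathbf{y} = y_1, \dots, y_m$, we denote by
$q^{\mathrm{skel}}(\mathbf{x})$ the union of conjunctive queries over
$\mathbf{T}^{\mathrm{skel}}$ of the form

$$\exists \mathbf{z}_1 \dots \mathbf{z}_m\, \phi_{s_1, \dots, s_n, s'_1, \dots, s'_m}(x_1, \dots, x_n, \mathbf{z}_1, \dots, \mathbf{z}_m),$$

where $s_1 = \dots = s_n = \bullet$; $s'_1, \dots, s'_m$ are
$\Theta$-skeletons of depth at most $r$; and each $\mathbf{z}_i$ is a
sequence of fresh variables of length $\mathit{arity}(s'_i)$.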

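For example, consider the EFSOTGD+EFSOTGD schema mapping $\mathcal{M}$
whose constraints are
$\exists f\, \forall x, y\, (P(x, y) \to Q(f(y), y, y))$ and
$\exists g\, \forall x, y, z\, (Q(x, y, z) \to Q(x, y, g(x, y)))$.
Then $\mathcal{M}^{\mathrm{skel}}$ will include the s-t tgd
$P(x, y) \to Q_{f(\bullet), \bullet, \bullet}(y, y, y)$, and the
target tgds
$Q_{\bullet, \bullet, \bullet}(x, y, z) \to Q_{\bullet, \bullet, g(\bullet, \bullet)}(x, y, x, y)$
and
$Q_{f(\bullet), \bullet, \bullet}(x, y, z) \to Q_{f(\bullet), \bullet, g(f(\bullet), \bullet)}(x, y, x, y)$.
The full skeleton rewriting of our running example schema mapping is
given in Figure 12.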

Remark. An optimized version of the schema mapping in Figure 12 is
shown in Figure 13, based on the simple observation that in Figure 12
none of
$\{T_{\bullet, \bullet},\ T_{f(\bullet, \bullet), \bullet},\ T_{f(\bullet, \bullet), f(\bullet, \bullet)}\}$
appears on the right-hand side of any tgd, and thus the left-hand
sides of many tgds cannot be satisfied in any universal solution, and
in turn none of $\{\mathrm{Eq}_{\ldots}, \ldots\}$ ever appears on the
right-hand side of a remaining tgd in which it does not also appear on
the left-hand side. We leave development of a principled approach to
optimization for future work.

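Proposition 6.5. Let
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
weakly acyclic EFSOTGD+EFSOTGD schema mapping. Let
$\mathcal{M}^{\mathrm{skel}}$ be the skeleton rewriting of
$\mathcal{M}$. For every UCQ $q$ over $\mathbf{T}$ and for every
$\mathbf{S}$-instance $I$, we have
$\text{certain}(q, I, \mathcal{M}) = \text{certain}(q^{\mathrm{skel}}, I, \mathcal{M}^{\mathrm{skel}})$.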

Hint. We show that there exists a solution $J$ for $I$ with respect to
$\mathcal{M}$ if and only if there exists a solution $J'$ for $I$ with
respect to $\mathcal{M}^{\mathrm{skel}}$. Furthermore, $J'$
(respectively $J$) can be constructed such that for any UCQ $q$ over
$\mathbf{T}$, we have
$q_\downarrow(J) = q^{\mathrm{skel}}_\downarrow(J')$. To construct $J$
from $J'$, we copy every tuple, and use the skeletons and their
arguments to construct the compound terms. To construct $J'$ from $J$,
we copy every tuple, using a witnessing collection of functions
$\mathbf{f}^0$ such that
$(I, J) \models \mathcal{M}\,[\mathbf{f} \mapsto \mathbf{f}^0]$ and
such that each null value is the denotation of a unique term of depth
at most $r$. This term gives us both the skeleton and the arguments
that belong in $J'$. □

6.4 Proof of Theorem 6.2

We can finally prove Theorem 6.2 by combining the above results. Let
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$ be a
weakly acyclic SOTGD+(SOTGD, EGD) schema mapping, and let
$\widehat{\mathcal{M}}$ be the skeleton rewriting of the equality
singularization of $\mathcal{M}$, extended with the egd

$$\mathrm{Eq}_{\bullet, \bullet}(x, y) \to x = y.$$

Furthermore, for any UCQ $q$ over $\mathbf{T}$, let $\widehat{q}$ be
the skeleton rewriting of the equality singularization of $q$. Then we
claim that
$\text{XR-certain}(q, I, \mathcal{M}) = \text{XR-certain}(\widehat{q}, I, \widehat{\mathcal{M}})$.

It suffices to show that, for all source instances $I$,

1. $I$ has a solution with respect to $\mathcal{M}$ if and only if $I$
   has a solution with respect to $\widehat{\mathcal{M}}$.

2. if $I$ has a solution with respect to $\mathcal{M}$, then, for all
   UCQs $q$ over $\mathbf{T}$,
   $\text{certain}(q, I, \mathcal{M}) = \text{certain}(\widehat{q}, I, \widehat{\mathcal{M}})$.

The first item follows from the first item of Proposition 6.3 and
Proposition 6.5. The second item follows from the second item of
Proposition 6.3 and Proposition 6.5.

6.5 Related Work 

Theorem 6.2 allows us to extend the DLP-rewriting technique of
Section 5 to GLAV+(WAGLAV, EGD) schema mappings (and, in fact, to
weakly acyclic SOTGD+(SOTGD, EGD) schema mappings). The proof is based
on a method for eliminating the existentially quantified variables.
Others have previously considered methods for eliminating existential
quantifiers from tgds, an early example being Duschka and Genesereth's
inverse rules algorithm [17] for acyclic LAV rules, which inspired our
approach. In [27], Krötzsch and Rudolph describe an existentially
quantified variable elimination procedure for schema mappings composed
of GLAV constraints and relational denial constraints (a subset of
denial constraints with no equality or inequality atoms) that are
jointly acyclic (a relaxation of weak acyclicity). Their approach is
similar to ours in that it creates extra attributes to represent
skolem terms in place of existentially quantified variables, but our
constraint language includes the additional expressiveness of egds,
whose careful handling is a primary concern of our approach. Marnette
studied termination of the chase for schema mappings with target
constraints in [35], where he introduced the oblivious skolem chase, a
modification of the chase procedure in which skolem terms are allowed
to appear in instances. A similar procedure was used to prove the
correctness of a limited form of skeleton rewriting in [38].

Equality singularization for tgds was introduced in [35], where it was
referred to simply as "singularization". In [38], another equality
simulation technique was used, based on substitution. In that
presentation, the simulation was woven into the skeleton rewriting
step.

Theorem 6.2 is related to a result in an unpublished manuscript [36],
which can be stated as follows: given any GLAV+WAGLAV schema mapping
$\mathcal{M}$ and any conjunctive query $q$, one can compute a Datalog
program that, given any source instance as input, computes the certain
answers of $q$ with respect to $\mathcal{M}$. Note that, conceptually,
a Datalog program can be viewed as a GAV+GAV schema mapping where the
source schema consists of the EDB predicates and the target schema
consists of the IDB predicates.

7 Concluding Remarks 

In this paper, we introduced the framework of exchange-repairs and
explored the XR-certain answers as an alternative non-trivial and
meaningful semantics of queries in the context of data exchange.
Exchange-repair semantics differ from other proposals for handling
inconsistencies in data exchange in that, conceptually, the
inconsistencies are repaired at the source rather than the target.
This allows the shared origins of target facts to be reflected in the
answers to target queries.

This framework brings together data exchange, database repairs, and
disjunctive logic programming, thus enhancing the interaction between
three different areas of research. Moreover, the results reported here
pave the way for using DLP solvers, such as DLV, for query answering
under the exchange-repair semantics.

8 Acknowledgements 

The research of all authors was partially supported 
by NSF Grant IIS-1217869. Kolaitis’ research was 
also supported by the project “Handling Uncertainty 
in Data Intensive Applications” under the program 
THALES. 

References 

[1] F. N. Afrati and P. G. Kolaitis. Repair checking in inconsistent
databases: algorithms and complexity. In R. Fagin, editor, ICDT,
volume 361 of ACM International Conference Proceeding Series, pages
31-41. ACM, 2009.

[2] M. Alviano, W. Faber, N. Leone, S. Perri, G. Pfeifer, and
G. Terracina. The disjunctive datalog system dlv. In Datalog, pages
282-301, 2010.

[3] M. Arenas, P. Barceló, L. Libkin, and F. Murlak. Relational and
XML Data Exchange. Synthesis Lectures on Data Management. Morgan &
Claypool Publishers, 2010.

[4] M. Arenas, L. E. Bertossi, and J. Chomicki. Consistent query
answers in inconsistent databases. In V. Vianu and C. H.
Papadimitriou, editors, PODS, pages 68-79. ACM Press, 1999.

[5] M. Arenas, R. Fagin, and A. Nash. Composition with target
constraints. Logical Methods in Computer Science, 7(3), 2011.

[6] M. Arenas, J. Pérez, J. L. Reutter, and C. Riveros. The language
of plain so-tgds: Composition, inversion and structural properties.
J. Comput. Syst. Sci., 79(6):763-784, 2013.

[7] L. E. Bertossi. Database Repairing and Consistent Query Answering.
Synthesis Lectures on Data Management. Morgan & Claypool Publishers,
2011.

[8] L. E. Bertossi, J. Chomicki, A. Cortes-Calabuig, and C. Gutierrez.
Consistent answers from integrated data sources. In T. Andreasen,
A. Motro, H. Christiansen, and H. L. Larsen, editors, FQAS, volume
2522 of Lecture Notes in Computer Science, pages 71-85. Springer,
2002.

[9] M. Bienvenu. On the complexity of consistent query answering in
the presence of simple ontologies. In Proceedings of the Twenty-Sixth
AAAI Conference on Artificial Intelligence, July 22-26, 2012, Toronto,
Ontario, Canada, 2012.

[10] M. Bienvenu, C. Bourgaux, and F. Goasdoué. Querying inconsistent
description logic knowledge bases under preferred repair semantics. In
Proceedings of the Twenty-Eighth AAAI Conference on Artificial
Intelligence, July 27-31, 2014, Quebec City, Quebec, Canada, pages
996-1002, 2014.

[11] M. Bienvenu and R. Rosati. Tractable approximations of consistent
query answering for robust ontology-based data access. In IJCAI 2013,
Proceedings of the 23rd International Joint Conference on Artificial
Intelligence, Beijing, China, August 3-9, 2013, 2013.

[12] L. Bravo and L. E. Bertossi. Logic programs for consistently
querying data integration systems. In Gottlob and Walsh [23], pages
10-15.

[13] A. Cali, D. Lembo, and R. Rosati. On the 
decidability and complexity of query answering 
over inconsistent and incomplete databases. In 
F. Neven, C. Beeri, and T. Milo, editors, PODS, 
pages 260-271. ACM, 2003. 

[14] A. Cali, D. Lembo, and R. Rosati. Query rewriting and answering
under constraints in data integration systems. In Gottlob and Walsh
[23], pages 16-21.

[15] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi,
and R. Rosati. Ontology-based database access. In Proceedings of the
Fifteenth Italian Symposium on Advanced Database Systems, SEBD 2007,
17-20 June 2007, Torre Canne, Fasano, BR, Italy, pages 324-331, 2007.

[16] J. Chomicki and J. Marcinkowski. Minimal-change integrity
maintenance using tuple deletions. Inf. Comput., 197(1-2):90-121,
2005.

[17] O. M. Duschka and M. R. Genesereth. Answering recursive queries
using views. In A. O. Mendelzon and Z. M. Ozsoyoglu, editors, PODS,
pages 109-116. ACM Press, 1997.

[18] B. ten Cate, G. Fontaine, and P. G. Kolaitis. On the data
complexity of consistent query answering. In A. Deutsch, editor, ICDT,
pages 22-33. ACM, 2012.

[19] R. Fagin, P. G. Kolaitis, R. J. Miller, and 
L. Popa. Data exchange: semantics and query 
answering. Theor. Comput. Sci., 336(1):89-124, 
2005. 

[20] R. Fagin, P. G. Kolaitis, L. Popa, and W. C. Tan. Composing
schema mappings: Second-order dependencies to the rescue. ACM Trans.
Database Syst., 30(4):994-1055, 2005.

[21] A. Fuxman and R. J. Miller. First-order query 
rewriting for inconsistent databases. J. Comput. 
Syst. Sci., 73(4):610-635, 2007. 

[22] M. Gelfond and V. Lifschitz. The stable model 
semantics for logic programming. In R. A. 
Kowalski and K. A. Bowen, editors, ICLP/SLP, 
pages 1070-1080. MIT Press, 1988. 

[23] G. Gottlob and T. Walsh, editors. I.JCAI- 
03, Proceedings of the Eighteenth International 
Joint Conference on Artifieial Intelligence, Aea- 
pulco, Mexieo, August 9-15, 2003. Morgan Kauf- 
mann, 2003. 

[24] G. Grahne and A. Onet. Data correspondence, 
exchange and repair. In L. Segoufin, editor, 


ICDT, ACM International Conference Proceed¬ 
ing Series, pages 219-230. ACM, 2010. 

[25] T. Janhunen and E. Oikarinen. Capturing par¬ 
allel circumscription with disjunctive logic pro¬ 
grams. In J. J. Alferes and J. A. Leite, editors, 
JELIA, volume 3229 of Lecture Notes in Com¬ 
puter Science, pages 134-146. Springer, 2004. 

[26] P. G. Kolaitis, J. Panttaja, and W. C. Tan. The 
complexity of data exchange. In S. Vansum- 
meren, editor, PODS, pages 30-39. ACM, 2006. 

[27] M. Krötzsch and S. Rudolph. Extending decidable
existential rules by joining acyclicity
and guardedness. In T. Walsh, editor, IJCAI 2011,
Proceedings of the 22nd International
Joint Conference on Artificial Intelligence,
Barcelona, Catalonia, Spain, July 16-22,
2011, pages 963-968. IJCAI/AAAI, 2011.

[28] D. Lembo, M. Lenzerini, and R. Rosati. Source
inconsistency and incompleteness in data integration.
In A. Borgida, D. Calvanese, L. Cholvy,
and M. Rousset, editors, Proceedings of the
9th International Workshop on Knowledge Representation
meets Databases (KRDB 2002),
Toulouse, France, April 21, 2002, volume 54 of
CEUR Workshop Proceedings. CEUR-WS.org,
2002.

[29] D. Lembo, M. Lenzerini, R. Rosati, M. Ruzzi,
and D. F. Savo. Inconsistency-tolerant semantics
for description logics. In Web Reasoning and
Rule Systems - Fourth International Conference,
RR 2010, Bressanone/Brixen, Italy, September
22-24, 2010. Proceedings, pages 103-117, 2010.

[30] D. Lembo and M. Ruzzi. Consistent query answering
over description logic ontologies. In
M. Marchiori, J. Z. Pan, and C. de Sainte Marie,
editors, Web Reasoning and Rule Systems, First
International Conference, RR 2007, Innsbruck,
Austria, June 7-8, 2007, Proceedings, volume
4524 of Lecture Notes in Computer Science,
pages 194-208. Springer, 2007.

[31] M. Lenzerini. Data integration: A theoretical
perspective. In L. Popa, S. Abiteboul, and P. G.
Kolaitis, editors, PODS, pages 233-246. ACM,
2002.

[32] N. Leone, G. Pfeifer, W. Faber, T. Eiter, G. Gottlob,
S. Perri, and F. Scarcello. The DLV system
for knowledge representation and reasoning.
ACM Trans. Comput. Log., 7(3):499-562, 2006.

[33] T. Lukasiewicz, M. V. Martinez, A. Pieris, and
G. I. Simari. From classical to consistent query
answering under existential rules. In Proceedings
of the Twenty-Ninth AAAI Conference on Artificial
Intelligence, January 25-30, 2015, Austin,
Texas, USA, pages 1546-1552, 2015.

[34] M. C. Marileo and L. E. Bertossi. The consistency
extractor system: Answer set programs for
consistent query answering in databases. Data
Knowl. Eng., 69(6):545-572, 2010.

[35] B. Marnette. Generalized schema-mappings: 
from termination to tractability. In J. Paredaens 
and J. Su, editors, PODS, pages 13-22. ACM, 
2009. 

[36] B. Marnette. Resolution and datalog rewriting 
under value invention and equality constraints. 
CoRR, abs/1212.0254, 2012. 

[37] R. Rosati. On the complexity of dealing with
inconsistency in description logic ontologies. In
IJCAI 2011, Proceedings of the 22nd International
Joint Conference on Artificial Intelligence,
Barcelona, Catalonia, Spain, July 16-22,
2011, pages 1057-1062, 2011.

[38] B. ten Cate, R. L. Halpert, and P. G. Kolaitis.
Exchange-repairs: Managing inconsistency
in data exchange. In R. Kontchakov and
M.-L. Mugnier, editors, RR, volume 8741 of Lecture
Notes in Computer Science, pages 140-156.
Springer, 2014.



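\[
\mathbf{S} = \{R\}
\]

\[
\mathbf{T} = \{\, T_{\cdot,\cdot},\ T_{\cdot,f(\cdot,\cdot)},\ T_{f(\cdot,\cdot),\cdot},\ T_{f(\cdot,\cdot),f(\cdot,\cdot)},\ \mathrm{Eq}_{\cdot,\cdot},\ \mathrm{Eq}_{\cdot,f(\cdot,\cdot)},\ \mathrm{Eq}_{f(\cdot,\cdot),\cdot},\ \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)} \,\}
\]

\[
\Sigma_{st} = \left\{
\begin{aligned}
& R(x,y) \to T_{\cdot,f(\cdot,\cdot)}(x,x,y) \\
& R(x,y) \to T_{\cdot,f(\cdot,\cdot)}(y,x,y)
\end{aligned}
\right\}
\]

\[
\Sigma_{t} = \left\{
\begin{aligned}
& T_{\cdot,\cdot}(x,y) \land \mathrm{Eq}_{\cdot,\cdot}(x,x') \land T_{\cdot,\cdot}(x',y') \to \mathrm{Eq}_{\cdot,\cdot}(y,y') \\
& T_{\cdot,\cdot}(x,y) \land \mathrm{Eq}_{\cdot,\cdot}(x,x') \land T_{\cdot,f(\cdot,\cdot)}(x',y_1',y_2') \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(y,y_1',y_2') \\
& T_{\cdot,\cdot}(x,y) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x,x_1',x_2') \land T_{f(\cdot,\cdot),\cdot}(x_1',x_2',y') \to \mathrm{Eq}_{\cdot,\cdot}(y,y') \\
& T_{\cdot,\cdot}(x,y) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x,x_1',x_2') \land T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1',x_2',y_1',y_2') \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(y,y_1',y_2') \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \land \mathrm{Eq}_{\cdot,\cdot}(x,x') \land T_{\cdot,\cdot}(x',y') \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(y_1,y_2,y') \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \land \mathrm{Eq}_{\cdot,\cdot}(x,x') \land T_{\cdot,f(\cdot,\cdot)}(x',y_1',y_2') \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1',y_2') \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x,x_1',x_2') \land T_{f(\cdot,\cdot),\cdot}(x_1',x_2',y') \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(y_1,y_2,y') \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x,x_1',x_2') \land T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1',x_2',y_1',y_2') \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1',y_2') \\
& T_{f(\cdot,\cdot),\cdot}(x_1,x_2,y) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_1,x_2,x') \land T_{\cdot,\cdot}(x',y') \to \mathrm{Eq}_{\cdot,\cdot}(y,y') \\
& T_{f(\cdot,\cdot),\cdot}(x_1,x_2,y) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_1,x_2,x') \land T_{\cdot,f(\cdot,\cdot)}(x',y_1',y_2') \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(y,y_1',y_2') \\
& T_{f(\cdot,\cdot),\cdot}(x_1,x_2,y) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,x_1',x_2') \land T_{f(\cdot,\cdot),\cdot}(x_1',x_2',y') \to \mathrm{Eq}_{\cdot,\cdot}(y,y') \\
& T_{f(\cdot,\cdot),\cdot}(x_1,x_2,y) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,x_1',x_2') \land T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1',x_2',y_1',y_2') \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(y,y_1',y_2') \\
& T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,y_1,y_2) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_1,x_2,x') \land T_{\cdot,\cdot}(x',y') \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(y_1,y_2,y') \\
& T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,y_1,y_2) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_1,x_2,x') \land T_{\cdot,f(\cdot,\cdot)}(x',y_1',y_2') \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1',y_2') \\
& T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,y_1,y_2) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,x_1',x_2') \land T_{f(\cdot,\cdot),\cdot}(x_1',x_2',y') \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(y_1,y_2,y') \\
& T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,y_1,y_2) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,x_1',x_2') \land T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1',x_2',y_1',y_2') \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1',y_2') \\[4pt]
& T_{\cdot,\cdot}(x,y) \to \mathrm{Eq}_{\cdot,\cdot}(x,x) \\
& T_{\cdot,\cdot}(x,y) \to \mathrm{Eq}_{\cdot,\cdot}(y,y) \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \to \mathrm{Eq}_{\cdot,\cdot}(x,x) \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1,y_2) \\
& T_{f(\cdot,\cdot),\cdot}(x_1,x_2,y) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,x_1,x_2) \\
& T_{f(\cdot,\cdot),\cdot}(x_1,x_2,y) \to \mathrm{Eq}_{\cdot,\cdot}(y,y) \\
& T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,y_1,y_2) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,x_1,x_2) \\
& T_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_1,x_2,y_1,y_2) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1,y_2) \\[4pt]
& \mathrm{Eq}_{\cdot,\cdot}(x_1,x_2) \to \mathrm{Eq}_{\cdot,\cdot}(x_2,x_1) \\
& \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_1,x_{21},x_{22}) \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{21},x_{22},x_1) \\
& \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{11},x_{12},x_2) \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_2,x_{11},x_{12}) \\
& \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{21},x_{22}) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{21},x_{22},x_{11},x_{12}) \\[4pt]
& \mathrm{Eq}_{\cdot,\cdot}(x_1,x_2) \land \mathrm{Eq}_{\cdot,\cdot}(x_2,x_3) \to \mathrm{Eq}_{\cdot,\cdot}(x_1,x_3) \\
& \mathrm{Eq}_{\cdot,\cdot}(x_1,x_2) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_2,x_{31},x_{32}) \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_1,x_{31},x_{32}) \\
& \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_1,x_{21},x_{22}) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{21},x_{22},x_3) \to \mathrm{Eq}_{\cdot,\cdot}(x_1,x_3) \\
& \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_1,x_{21},x_{22}) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{21},x_{22},x_{31},x_{32}) \to \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_1,x_{31},x_{32}) \\
& \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{11},x_{12},x_2) \land \mathrm{Eq}_{\cdot,\cdot}(x_2,x_3) \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{11},x_{12},x_3) \\
& \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{11},x_{12},x_2) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(x_2,x_{31},x_{32}) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{31},x_{32}) \\
& \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{21},x_{22}) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{21},x_{22},x_3) \to \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(x_{11},x_{12},x_3) \\
& \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{21},x_{22}) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{21},x_{22},x_{31},x_{32}) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{31},x_{32})
\end{aligned}
\right\}
\]

\[
\begin{aligned}
\mathit{reachable}^{\mathrm{skel}}(x,y) :=\ & \exists c, c'\, \big(T_{\cdot,\cdot}(x,c) \land \mathrm{Eq}_{\cdot,\cdot}(c,c') \land T_{\cdot,\cdot}(y,c')\big) \\
\cup\ & \exists c, c_1', c_2'\, \big(T_{\cdot,\cdot}(x,c) \land \mathrm{Eq}_{\cdot,f(\cdot,\cdot)}(c,c_1',c_2') \land T_{\cdot,f(\cdot,\cdot)}(y,c_1',c_2')\big) \\
\cup\ & \exists c_1, c_2, c'\, \big(T_{\cdot,f(\cdot,\cdot)}(x,c_1,c_2) \land \mathrm{Eq}_{f(\cdot,\cdot),\cdot}(c_1,c_2,c') \land T_{\cdot,\cdot}(y,c')\big) \\
\cup\ & \exists c_1, c_2, c_1', c_2'\, \big(T_{\cdot,f(\cdot,\cdot)}(x,c_1,c_2) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(c_1,c_2,c_1',c_2') \land T_{\cdot,f(\cdot,\cdot)}(y,c_1',c_2')\big)
\end{aligned}
\]


Figure 12: Undirected reachability, skolemized, equality singularized, and skeleton rewritten.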
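
The rules in Figure 12 follow a regular pattern over the two term skeletons (a plain value and a Skolem term f(.,.)). The following Python sketch is an illustration only, not the construction used in the paper: it enumerates rule strings by that pattern, using hypothetical helper names (`tup`, `atom`, `egd_rules`, and so on) and an ad hoc textual syntax such as `T[.,f(.,.)]`.

```python
from itertools import product

# Two term skeletons: "." is a plain value, "f(.,.)" a Skolem term with two arguments.
SKELETONS = (".", "f(.,.)")

def tup(name, skel, mark=""):
    """Variables for one position: one variable for ".", two for "f(.,.)"."""
    if skel == ".":
        return [name + mark]
    return [f"{name}1{mark}", f"{name}2{mark}"]

def atom(rel, s1, s2, left, right):
    """Render an atom of relation rel indexed by the skeleton pair (s1, s2)."""
    return f"{rel}[{s1},{s2}]({', '.join(left + right)})"

def egd_rules():
    """Singularized egd, one copy per choice of four skeletons (16 rules)."""
    rules = []
    for s1, s2, s3, s4 in product(SKELETONS, repeat=4):
        x, y = tup("x", s1), tup("y", s2)
        xp, yp = tup("x", s3, "'"), tup("y", s4, "'")
        body = " & ".join([atom("T", s1, s2, x, y),
                           atom("Eq", s1, s3, x, xp),
                           atom("T", s3, s4, xp, yp)])
        rules.append(f"{body} -> {atom('Eq', s2, s4, y, yp)}")
    return rules

def reflexivity_rules():
    """Eq is reflexive on every term occurring in some T relation (8 rules)."""
    rules = []
    for s1, s2 in product(SKELETONS, repeat=2):
        x, y = tup("x", s1), tup("y", s2)
        t = atom("T", s1, s2, x, y)
        rules.append(f"{t} -> {atom('Eq', s1, s1, x, x)}")
        rules.append(f"{t} -> {atom('Eq', s2, s2, y, y)}")
    return rules

def symmetry_rules():
    """Eq is symmetric (4 rules)."""
    return [f"{atom('Eq', s1, s2, tup('a', s1), tup('b', s2))} -> "
            f"{atom('Eq', s2, s1, tup('b', s2), tup('a', s1))}"
            for s1, s2 in product(SKELETONS, repeat=2)]

def transitivity_rules():
    """Eq is transitive (8 rules)."""
    rules = []
    for s1, s2, s3 in product(SKELETONS, repeat=3):
        a, b, c = tup("a", s1), tup("b", s2), tup("c", s3)
        rules.append(f"{atom('Eq', s1, s2, a, b)} & {atom('Eq', s2, s3, b, c)}"
                     f" -> {atom('Eq', s1, s3, a, c)}")
    return rules
```

Under these assumptions, the four generators yield 16, 8, 4, and 8 rules respectively, matching the four groups of target rules listed in the figure.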






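\[
\mathcal{M}^{\mathrm{skel}}:
\]

\[
\mathbf{S} = \{R\}
\]

\[
\mathbf{T} = \{\, T_{\cdot,f(\cdot,\cdot)},\ \mathrm{Eq}_{\cdot,\cdot},\ \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)} \,\}
\]

\[
\Sigma_{st} = \left\{
\begin{aligned}
& R(x,y) \to T_{\cdot,f(\cdot,\cdot)}(x,x,y) \\
& R(x,y) \to T_{\cdot,f(\cdot,\cdot)}(y,x,y)
\end{aligned}
\right\}
\]

\[
\Sigma_{t} = \left\{
\begin{aligned}
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \land \mathrm{Eq}_{\cdot,\cdot}(x,x') \land T_{\cdot,f(\cdot,\cdot)}(x',y_1',y_2') \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1',y_2') \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \to \mathrm{Eq}_{\cdot,\cdot}(x,x) \\
& T_{\cdot,f(\cdot,\cdot)}(x,y_1,y_2) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(y_1,y_2,y_1,y_2) \\
& \mathrm{Eq}_{\cdot,\cdot}(x_1,x_2) \to \mathrm{Eq}_{\cdot,\cdot}(x_2,x_1) \\
& \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{21},x_{22}) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{21},x_{22},x_{11},x_{12}) \\
& \mathrm{Eq}_{\cdot,\cdot}(x_1,x_2) \land \mathrm{Eq}_{\cdot,\cdot}(x_2,x_3) \to \mathrm{Eq}_{\cdot,\cdot}(x_1,x_3) \\
& \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{21},x_{22}) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{21},x_{22},x_{31},x_{32}) \to \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(x_{11},x_{12},x_{31},x_{32})
\end{aligned}
\right\}
\]

\[
\mathit{reachable}^{\mathrm{skel}}(x,y) := \exists c_1, c_2, c_1', c_2'\, \big(T_{\cdot,f(\cdot,\cdot)}(x,c_1,c_2) \land \mathrm{Eq}_{f(\cdot,\cdot),f(\cdot,\cdot)}(c_1,c_2,c_1',c_2') \land T_{\cdot,f(\cdot,\cdot)}(y,c_1',c_2')\big)
\]


Figure 13: Example schema mapping from Figure 9, skolemized, equality singularized, skeleton rewritten,
and optimized.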
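
To make the optimized mapping of Figure 13 concrete, the sketch below reads its rules as plain Datalog rules and evaluates them bottom-up on a small source instance. This is an illustrative assumption, not the evaluation procedure of the paper; the function names `chase` and `reachable` and the tuple encodings (`T` as triples, the two `Eq` relations as pairs and 4-tuples) are hypothetical.

```python
def chase(R):
    """Naive fixpoint of the Figure 13 rules over a set R of source pairs."""
    # Source-to-target rules: R(x,y) -> T(x,x,y) and R(x,y) -> T(y,x,y).
    T = {(x, x, y) for x, y in R} | {(y, x, y) for x, y in R}
    # Reflexivity rules seed the two equality relations.
    eq_dot = {(x, x) for x, _, _ in T}
    eq_ff = {(y1, y2, y1, y2) for _, y1, y2 in T}
    changed = True
    while changed:
        new_dot, new_ff = set(eq_dot), set(eq_ff)
        # Singularized egd: T(x,y1,y2) & Eq(x,x') & T(x',z1,z2) -> Eq_ff(y1,y2,z1,z2).
        for x, y1, y2 in T:
            for xp, z1, z2 in T:
                if (x, xp) in eq_dot:
                    new_ff.add((y1, y2, z1, z2))
        # Symmetry.
        new_dot |= {(b, a) for a, b in eq_dot}
        new_ff |= {(c, d, a, b) for a, b, c, d in eq_ff}
        # Transitivity.
        new_dot |= {(a, c) for a, b in eq_dot for b2, c in eq_dot if b == b2}
        new_ff |= {(a1, a2, c1, c2)
                   for a1, a2, b1, b2 in eq_ff
                   for d1, d2, c1, c2 in eq_ff
                   if (b1, b2) == (d1, d2)}
        changed = new_dot != eq_dot or new_ff != eq_ff
        eq_dot, eq_ff = new_dot, new_ff
    return T, eq_dot, eq_ff

def reachable(R):
    """Pairs (x, y) satisfying the reachable query of Figure 13."""
    T, _, eq_ff = chase(R)
    return {(x, y)
            for x, c1, c2 in T
            for y, d1, d2 in T
            if (c1, c2, d1, d2) in eq_ff}

# Example: two connected components, {a, b, c} and {d, e}.
edges = {("a", "b"), ("b", "c"), ("d", "e")}
pairs = reachable(edges)
```

On this example, `pairs` relates any two nodes of the same undirected component (for instance `("a", "c")`) and no nodes from different components, which is consistent with the figure's description of the query as undirected reachability.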







